Velocity Regulation of 3D Bipedal Walking Robots with Uncertain Dynamics Through Adaptive Neural Network Controller
Abstract
This paper presents a neural-network based adaptive feedback control structure to regulate the velocity of 3D bipedal robots under dynamics uncertainties. Existing Hybrid Zero Dynamics (HZD)-based controllers regulate velocity through the implementation of heuristic regulators that do not consider model and environmental uncertainties, which may significantly affect the tracking performance of the controllers. In this paper, we address the uncertainties in the robot dynamics from the perspective of the reduced dimensional representation of virtual constraints and propose the integration of an adaptive neural network-based controller to regulate the robot velocity in the presence of model parameter uncertainties. The proposed approach yields improved tracking performance under dynamics uncertainties. The shallow adaptive neural network used in this paper does not require training a priori and has the potential to be implemented on the real-time robotic controller. A comparative simulation study of a 3D Cassie robot is presented to illustrate the performance of the proposed approach under various scenarios.
I Introduction
Model-based controllers for 3D walking robots have received considerable attention from the robotics community due to their ability to take full advantage of the natural hybrid dynamics to achieve dynamic locomotion. Existing approaches, however, may fail to stabilize the robot or accurately track the desired behaviors under model uncertainties. Parameters that can significantly affect the system dynamics include the torso’s mass and the position of the torso’s center of mass, to name a few. On a real robot, such changes may occur when mounting additional equipment on the robot or adding a temporary load, or they may result from the wear of mechanical parts such as joints or linkages through continuous use of the robot.
Another important topic in dynamic locomotion is the velocity regulation of the walking robots. Accurate and stable velocity tracking is critical in applications of motion planning, object tracking, and human-robot interactions. For instance, an accompanying robot walking next to a person needs to regulate its velocity to maintain a close distance to that person. Different methods have been proposed to address the velocity tracking problem through feedback controllers based on classical control techniques [1, 2, 3, 4, 5], and more recently based on various machine learning methods [6, 7]. However, these controllers do not take the uncertainties in the robot dynamics into consideration, resulting in noticeably compromised tracking performance in the presence of changes in model properties. For example, in [4], the authors showed that changing mass and inertia parameters by results in significant velocity tracking errors.

To understand and simplify the effect of model uncertainties for the stability and performance of bipedal locomotion, researchers have developed bio-inspired heuristic regulators to enhance the rate of convergence to the desired stable limit cycle [8]. In particular, [8] showed that the convergence of the nominal reference gait to the cyclic gait regime in planar bipedal robots could be accelerated via feedback regulators on top of the nominal trajectory tracking controllers by varying the torso inclination and the step length. This method is in line with the biomechanical approach for keeping stability while walking: when a person is walking and experiences a disturbance, it is a natural reaction to compensate by simply bending the torso or moving their leg [9, 10].
These findings have been extended to 3D bipedal robots by decoupling the motion of the robot into the longitudinal and lateral planes and applying the foot placement regulator—which varies the step length in response to the robot walking velocity—to track velocity in both directions [11, 2]. These heuristically designed regulators often require intensive manual tuning of feedback gains and yield noticeable steady-state tracking errors of velocity when there is a significant change in model parameters.
Recently, several results have addressed the problem of dynamics uncertainty in bipedal robots using adaptive control from different points of view. In [12], the authors present a strategy for the non-collocated adaptive control of underactuated mechanical systems by introducing the concept of virtual control and adaptive virtual constraints to produce stable limit walking cycles for two-link and three-link robots. However, these results are applied only to planar robots, and their extension to complex 3D robots could be limited by the high complexity of the corresponding mathematical models.
In this work, we are specifically interested in addressing uncertainty in the robot dynamics and its effect on the robot’s walking speed. Therefore, we attempt to provide a general framework to stabilize the robot’s walking limit cycle while tracking a desired average walking speed by employing adaptive neural network-based controllers. Inspired by the effective results of neural network controllers in the adaptive control of robot manipulators, we propose a control structure that combines the advantages of online learning with the robustness of classical feedback controllers to compensate for changes in the dynamic properties of the robot. It is worth mentioning that the controller proposed in this work is conceived as an adaptation technique to modify limit walking cycle gait trajectories already obtained by existing trajectory planning algorithms. Therefore, we do not focus on the process of obtaining such trajectories. Instead, we present a novel structure of adaptive feedback regulators that augment the nominal gait trajectories, rendering stable walking limit cycles at any desired walking speed within a wide range despite changes in model parameters.
We further summarize the primary contributions of the present paper as follows:
- A novel velocity tracking controller for bipedal walking is proposed. Through a nonlinear neural network parameterization, the proposed adaptation scheme is implemented online along with the nominal controller and converges quickly to the steady-state velocity under various dynamic uncertainties. In addition, the proposed adaptation framework is compatible with nominal controllers derived in both model-free and model-based fashions, and its simple structure makes it feasible to implement on real-time controllers.
- The proposed method yields consistent performance against significant uncertainties of dynamic properties. In particular, we show the robust performance of the adaptive controller with the pelvis mass experiencing a mass increase of up to 130%, and the center of mass assigned with up to m offset.
- The adaptation scheme does not require any dynamic or kinematic properties of the robot. The propagation of the adaptive controller relies only on observable states and measurable tracking error.
The remainder of the paper is organized as follows. Section II reviews the basics of hybrid zero dynamics and heuristic regulators for velocity regulation. Section III presents the main result of the paper, a novel neural-network based adaptive feedback regulator for dynamics uncertainties, and Section IV shows the improved performance of the proposed approach on the 3D Cassie robot in simulation. Conclusions are given in Section V.
II Problem Formulation
In this section, we first describe the classical structure of the HZD-based feedback controllers for 3D walking robots. Then we discuss how heuristic regulators can be designed to track velocity under model uncertainties.
II-A HZD-based Feedback Controllers
The HZD framework provides conditions for the existence of provably stable limit walking cycles by enforcing virtual constraints that are invariant through impact. This technique allows synthesizing feedback controllers that realize stable and dynamic locomotion in 2D/3D robots [13, 14].
Virtual Constraints: Let $q$ be the vector of the joint coordinates of the robot. The virtual constraints are defined as the difference between the actual and desired outputs of the robot [14]:
$$y := y^a(q) - y^d(\tau, \alpha), \tag{1}$$
where $y^d$ is given as a vector of Bézier polynomials parameterized by the coefficients $\alpha$, and $\tau$ is the phase variable that synchronizes all virtual constraints. In this paper, we choose $\tau$ to be the scaled relative time during one walking step, i.e.,
$$\tau = \frac{t - t_0}{T}, \tag{2}$$
where $T$ is the duration of one walking step, and $t_0$ is the time at the beginning of the step. Typically, the coefficients $\alpha$ of the desired outputs are determined via model-based offline gait optimization to achieve different periodic walking motions [15].
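As an illustration of how the desired outputs and the virtual constraints could be evaluated numerically, the following is a minimal sketch rather than the authors' implementation. The function names, the array layout of the coefficients (one row per output), and the clipping of the phase variable to [0, 1] are assumptions made for this example.

```python
import numpy as np
from math import comb


def phase(t, t0, T):
    """Scaled relative time within one step; clipping to [0, 1] is an assumption."""
    return float(np.clip((t - t0) / T, 0.0, 1.0))


def bezier(alpha, tau):
    """Evaluate Bezier polynomials at phase tau.

    alpha: array of shape (n_outputs, M + 1), one row of coefficients per output.
    """
    alpha = np.atleast_2d(np.asarray(alpha, dtype=float))
    M = alpha.shape[1] - 1
    k = np.arange(M + 1)
    binom = np.array([comb(M, i) for i in k], dtype=float)
    # Bernstein basis: C(M, k) * tau^k * (1 - tau)^(M - k)
    basis = binom * tau**k * (1.0 - tau) ** (M - k)
    return alpha @ basis


def virtual_constraint(y_actual, alpha, t, t0, T):
    """Actual output minus desired output at the current phase."""
    return np.asarray(y_actual, dtype=float) - bezier(alpha, phase(t, t0, T))
```

With equally spaced coefficients the Bézier curve is a straight line in the phase variable, which gives a quick sanity check: `bezier([[0, 1, 2, 3]], 0.5)` evaluates to `1.5`.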
II-B Dynamics Uncertainties and Heuristic Regulators
The mismatch between the mathematical model and the real 3D bipedal robot can cause the designed controllers to fail when applied to the real hardware. To prevent the robot from falling, researchers have proposed heuristically designed feedback regulators on top of the HZD-based controllers to realize asymptotic stability of the walking cycle [16, 17].
In practice, two regulators are mostly used to stabilize the walking motion of a 3D bipedal robot: foot placement and torso regulators [2, 11, 17]. These regulators often use decoupled structures, relating certain joints with a specific desired feature—such as hip velocity or torso inclination—of the robot motion. For example, the foot placement regulator adds an offset to the desired swing hip pitch and swing hip roll joints to regulate the longitudinal and lateral walking speeds, respectively, and prevent the robot from falling. These offsets are determined by
(3) | ||||
(4) |
where and are the average longitudinal and lateral speeds of the robot at the middle of step , , are the reference speeds, and are manually tuned gains. The readers can refer to [2, 11, 17] for additional information about the other regulators.
III Approach
In this section, we propose a non-conventional controller structure using an adaptive neural network to realize stable walking while tracking desired walking speeds in the presence of changes in the model properties.
III-A Motivation
Motivated by the work of [18, 19, 20, 21] on the application of neural networks to the control of nonlinear systems, e.g., robotic manipulators, we propose a framework of adaptive neural network-based controllers that compensate for the unknown dynamics of the system while tracking the desired velocity. In addition, we show that the proposed network can be seen as a generalization of the heuristic regulators described in Section II-B and as their extension to the case of unknown dynamics. With this, we aim to develop a new general framework for adaptive feedback controllers that render stable limit walking cycles even when the dynamic properties of the robot change. To validate the proposed method in simulation, we use as our testbed the Cassie-series bipedal robot described in Section II-A.
III-B A Review of Adaptive Neural Network-based Controller
Typical adaptive controllers rely on approximating the unknown dynamics of the system as a linear combination of unknown system parameters. A feedback control law is then computed from this approximation, with the parameter estimates updated using a closed-form update rule. In particular, [22] showed that the unknown dynamics of a robotic manipulator could be approximated as
$$\hat{f}(x) = Y(x)\,\hat{\theta}, \tag{5}$$
where $\hat{f}(x)$ is the approximated unknown dynamics, $Y(x)$ is a matrix of functions dependent on the state of the robot, and $\hat{\theta}$ is a vector of parameters. Then, an update law of the form
$$\dot{\hat{\theta}} = \Gamma\, Y^{T}(x)\, r \tag{6}$$
is used to estimate the unknown parameters online, where $\Gamma$ is a symmetric positive definite matrix, and $r$ is the filtered tracking error. However, these controllers depend on prior knowledge of the structure of the dynamics, and extensive system modeling and preliminary analysis are required to compute the regression matrix.
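The linear-in-parameters scheme can be illustrated on a toy scalar plant. The sketch below is not taken from [22] or from this paper: the plant $\dot{x} = u - Y(x)\theta$, the regressor $Y(x) = [x, \sin x]$, the gains, and the sign convention (error defined as desired minus actual, update rate $\Gamma Y^{T} r$, integrated with forward Euler) are all assumptions chosen for illustration.

```python
import numpy as np


def regressor(x):
    """Y(x): 1 x 2 row of known basis functions (an illustrative choice)."""
    return np.array([[x, np.sin(x)]])


def simulate(theta_true, gamma, k=5.0, dt=1e-3, steps=20000, adapt=True):
    """Track x_d(t) = sin(t) on the toy plant x_dot = u - Y(x) @ theta_true."""
    x = 0.5
    theta_hat = np.zeros(2)
    errors = np.empty(steps)
    for i in range(steps):
        t = i * dt
        x_d, x_d_dot = np.sin(t), np.cos(t)
        r = x_d - x                          # tracking error (first-order plant)
        Y = regressor(x)
        f_hat = (Y @ theta_hat).item()       # estimate of the unknown term
        u = x_d_dot + f_hat + k * r          # feedforward + estimate + feedback
        x += dt * (u - (Y @ theta_true).item())
        if adapt:
            # Euler step of theta_hat_dot = Gamma @ Y^T * r
            theta_hat = theta_hat + dt * (gamma @ Y.T).ravel() * r
        errors[i] = r
    return errors, theta_hat


if __name__ == "__main__":
    theta = np.array([1.0, -2.0])
    gamma = 10.0 * np.eye(2)
    for flag in (False, True):
        err, _ = simulate(theta, gamma, adapt=flag)
        rms = np.sqrt(np.mean(err[-5000:] ** 2))
        print(f"adapt={flag}: steady-state RMS error = {rms:.4f}")
```

Running the script compares the steady-state tracking error with and without adaptation. Note that the regressor must be known in advance, which is the restriction discussed above.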
With the emergence of learning methods and the ability of neural networks to approximate complex, nonlinear functions, new neural network-based adaptive control algorithms were proposed [21, 19, 20, 18]. The main advantage of neural network-based controllers is that they can approximate virtually any smooth function, including the unknown dynamics in a robotic system, without the need to compute a regression matrix. Analogous to equation (5), the unknown dynamics of more complex systems, for which a linear parameterization is not accurate enough, can be effectively approximated using neural networks. The nonlinear parameterization of the unknown dynamics can then be obtained through neural networks as
$$\hat{f}(x) = \hat{W}^{T} \sigma\!\left(\hat{V}^{T} x\right), \tag{7}$$
where $\hat{W}$, $\hat{V}$ are estimates of the ideal neural network weights provided by some on-line weight tuning algorithms, and $\sigma(\cdot)$ denotes the hidden-layer activation function.
III-C Adaptive Feedback Regulators for Virtual Constraints

In this section, we propose an adaptive feedback regulator for virtual constraints to achieve improved speed tracking under model uncertainties. The proposed regulator will have the following form:
$$\tilde{y}^d(\tau, \alpha) = y^d(\tau, \alpha) + \Delta y^d, \tag{8}$$
where $\Delta y^d$ represents the modification of the original trajectory $y^d(\tau, \alpha)$ needed to render a stable walking limit cycle. The original trajectory can be obtained either from offline optimization [14, 17] or from offline training of a neural network policy [7, 6].
We will then use an adaptive neural network based feedback controller to determine , as shown in Fig. 2. Inspired by the results of [8], we focus our analysis on three specific outputs: i) the virtual constraints related to the robot’s joints that control the step length in the longitudinal plane (swing hip pitch angle, swing knee), ii) the virtual constraints related to the joints that control the step length in the frontal plane (swing hip roll angle, stance hip roll angle), and iii) the virtual constraints related with the joints that control torso inclination (stance hip pitch angle, stance knee, stance hip roll angle). Let and be the average longitudinal and lateral speeds of the robot in the middle of step , , be the reference speeds, be the actual and desired torso inclination and angular velocity, we define
(9) | ||||
where , , represent the modification that compensates the unknown dynamics corresponding to each of the three decoupled systems respectively. A detailed structure for the decoupled controllers is presented in Fig. 3, each of which resembles a feedback PD controller and a feed-forward neural network term to track desired behaviors under uncertainties. This adaptive controller structure enhances the robustness of the controller since it allows the PD term to keep the system stable while the network is learning to compensate changes in the dynamic properties of the robot or the environment. The details of the structure and update rule of the neural networks that compute will be discussed in the following section.

Finally, we note that the structure chosen for the adaptive controllers proposed in (9) allows us to generalize the additional regulators described in Section II that stabilize the walking limit cycle. In particular, when the output of the neural network is zero, the adaptive controller reduces to the structure of the traditional regulators. However, the generalized structure proposed by this adaptive controller does not restrict the regulation of the joint trajectories to only the swing hip pitch angle (as in the traditional approach) but allows the controller to learn which trajectories should be modified in order to regulate the desired longitudinal and lateral velocity successfully. Section IV illustrates this point in detail through simulation results on the bipedal robot Cassie under various scenarios.
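The structure of one decoupled channel can be sketched as follows. This is a hedged illustration, not the authors' controller: the class name, the gains, the sign convention (error taken as measured minus reference speed), and the use of the step-to-step change in error as the derivative term are all assumptions. The sketch only shows how a feedback term and an optional learned feedforward term combine, and that a zero feedforward output leaves a purely heuristic regulator.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DecoupledRegulator:
    """One channel: PD feedback on step-averaged speed plus optional feedforward."""

    kp: float                      # proportional gain (hypothetical value)
    kd: float                      # derivative gain (hypothetical value)
    feedforward: Optional[Callable[[float, float], float]] = None
    _prev_error: float = 0.0

    def update(self, v_avg: float, v_ref: float) -> float:
        """Return the trajectory offset for the current walking step."""
        error = v_avg - v_ref
        d_error = error - self._prev_error   # change since the previous step
        self._prev_error = error
        pd_term = self.kp * error + self.kd * d_error
        # Learned compensation; zero when no network is attached.
        ff_term = self.feedforward(v_avg, v_ref) if self.feedforward else 0.0
        return pd_term + ff_term


if __name__ == "__main__":
    heuristic = DecoupledRegulator(kp=0.2, kd=0.1)
    adaptive = DecoupledRegulator(kp=0.2, kd=0.1, feedforward=lambda v, r: 0.05)
    print(heuristic.update(0.6, 0.5), adaptive.update(0.6, 0.5))
```

Keeping the feedback term active at all times is what lets the channel remain stable while the feedforward term is still being learned.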
III-D Adaptive Neural Network Structure
As shown in Fig. 3, the inputs of the neural networks are the actual and desired values of the longitudinal and lateral velocity, torso inclination, and torso angular velocity. The outputs are the feedforward terms compensating for the unknown dynamics of the decoupled systems. Each network has only one hidden layer with one thousand neurons. We can think of the hidden layer neurons as creating a random set of basis functions; the task of the neural network is to learn the weights on those basis functions that provide the desired compensation as a function of the inputs to the network. This is a variant of the adaptive controller proposed in [22], and can be given as a simple delta rule that applies only to the output weights:
$$\Delta \omega_{ij} = \eta\, a_i\, E_j, \tag{10}$$
where $\omega_{ij}$ is the weight from the $i$th hidden neuron to the $j$th output, $E_j$ is the error signal for the $j$th output (the difference between the desired and current velocity), $a_i$ is the output of the $i$th hidden neuron, and $\eta$ is the learning rate chosen as . This is structurally the same as (6).
It is important to mention that the neural network does not require training a priori since the learning process is performed online. The output weights are initialized to zero, and the input weights are randomly initialized. We use the encoder initialization scheme from [23] to generate input weights with a broad distribution, ensuring a collection of basis functions that covers the space. This general approach has been used as a model of biological adaptive arm control [24] and as a simple benchmark task controlling an inverted pendulum [25], but here we apply it in a very different context and with a different source of the error signal. Previous applications had always been in the domain of torque control and used the output of a PD controller to generate the error signal.
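The online learning scheme described above can be sketched with a single hidden layer of fixed random features and a delta rule on the output weights. This is a simplified stand-in under stated assumptions, not the authors' network: Gaussian input weights, uniform biases, and rectified-linear units replace the encoder initialization of [23], and the class name, default learning rate, and hidden-layer size used in the demo are placeholders.

```python
import numpy as np


class OnlineAdaptiveNetwork:
    """Fixed random hidden layer; only the output weights are learned online."""

    def __init__(self, n_inputs, n_outputs, n_hidden=1000,
                 learning_rate=1e-4, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random input weights and biases define the basis functions.
        self.encoders = rng.standard_normal((n_hidden, n_inputs))
        self.biases = rng.uniform(-1.0, 1.0, n_hidden)
        # Output weights start at zero, so the initial compensation is zero.
        self.weights = np.zeros((n_hidden, n_outputs))
        self.learning_rate = learning_rate

    def activities(self, x):
        """Hidden-layer outputs (rectified linear units, an assumption)."""
        return np.maximum(0.0, self.encoders @ np.asarray(x, dtype=float) + self.biases)

    def predict(self, x):
        return self.activities(x) @ self.weights

    def learn(self, x, error):
        """Delta rule: weight change = learning_rate * activity_i * error_j."""
        a = self.activities(x)
        self.weights += self.learning_rate * np.outer(a, np.asarray(error, dtype=float))
        return a @ self.weights


if __name__ == "__main__":
    net = OnlineAdaptiveNetwork(n_inputs=2, n_outputs=1, n_hidden=200,
                                learning_rate=1e-3)
    x, target = [0.5, -0.2], 0.3
    for _ in range(500):
        net.learn(x, [target - net.predict(x)[0]])
    print(net.predict(x))
```

Because the output weights are initialized to zero, the network contributes nothing at the first step and no offline training is involved; the compensation builds up only as error is observed.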
IV Simulation Results
The proposed method is validated in a dynamic simulation of Cassie using MuJoCo [26]. This section presents the results of the adaptive neural network based controller when the dynamic properties of the robot change, such as the mass and center of mass (COM) position of the robot’s torso. We also compare the adaptive controller with the traditional HZD controller and an HZD-based RL controller. Finally, we test the robustness of the controller when the robot is subject to adversarial forces and when it walks on uneven terrain. The results of these evaluations can be viewed in the supplemental video accompanying this submission [27].
IV-A Response to changes in the model properties of the robot
First, we tested the response of the controller when the mass of the torso is increased by approximately , and . The original mass of the robot’s torso is ; then, after the increments in mass the total mass of the torso corresponds to , , and respectively. Fig. 4 (a) shows the response for these three cases. Interestingly, we can see that the tracking error of the average speed does not converge to zero immediately, but it takes some time until the controller learns to compensate for the unknown dynamics, which illustrates the on-line learning process of the adaptive controller.

We also tested the controller with changes in different dynamic properties like the position of the center of mass of the pelvis, by adding an offset of , , and in the longitudinal direction. The responses of the adaptive controller for the 3 cases are shown in Fig. 4 (b), where we can see that the controller performs well even under large parameter uncertainties of the robot’s dynamics. Similarly as in the previous test, we can see the learning curve of the controller while the actual walking velocity converges to the desired velocity.
IV-B Comparison with traditional and RL HZD-based controller
To illustrate the significant contribution of the adaptive controller to the speed tracking and stability of the system in the face of model uncertainties, we compared its performance against two HZD-based controllers. Fig. 5 and Fig. 6 show the velocity tracking performance of the adaptive controller when compared with the classic HZD-based controller for tracking fixed desired velocity [17], and with an HZD-based Reinforcement Learning controller for tracking varying velocities [7]. The robot’s pelvis mass is increased by , and an offset of is added to the pelvis’ COM in the longitudinal direction. As shown in Fig. 5 and Fig. 6, the existing two HZD-based controllers yield significantly large steady-state tracking error when the dynamics properties of the robot change relative to the model used in the design of these controllers. With the adaptive controller, however, the actual walking velocities converge to a wide range of desired velocities under these changes. More importantly, the convergence is achieved through online learning of the model and does not require a priori training.


To further demonstrate the effectiveness of the adaptive controller to compensate for the unknown dynamics when tracking desired walking speeds in different directions we tested the controller for different walking speeds in diagonal directions. Fig. 7 shows the performance of the controller when tracking diagonal speed with , and ,

IV-C Robustness
Two tests were performed to evaluate the robustness of the adaptive controller: i) external disturbance rejection, and ii) walking on uneven terrain. To evaluate external disturbance rejection, an adversarial force is applied directly at the robot’s torso in both forward and backward directions 2.5 seconds after the test started. Fig. 8 shows the response of the adaptive controller when an adversarial force of and a force of are applied for 0.1 seconds in the forward and backward directions, respectively. The controller handles both external forces without falling and, more importantly, recovers its speed tracking performance quickly after the disturbance is applied.
Fig. 9 illustrates the response of the adaptive controller when the robot is walking on uneven terrain. The terrain presents significant irregularities with slopes up to degrees with a maximum height of . The controller adapts successfully to the different terrain changes and keeps close tracking of the desired velocity throughout the whole test.

IV-D Adaptation redundancy
In this subsection, we demonstrate the ability of the proposed controller framework to adapt to different operating conditions by taking advantage of the generalized structure of the controller. In particular, Fig. 10 (a) shows the speed tracking performance of the adaptive controller for two cases: 1) both hip and knee joints use the compensation provided by the neural network to regulate the walking speed, and 2) the neural network output corresponding to the hip joint is forced to zero to test the adaptability of the controller to unknown scenarios. The results of this test are shown in Fig. 10 (b), where we see that for case 1, both stance hip and stance knee joints contribute equally to the compensation of the unknown dynamics while tracking the desired speed of the robot. For case 2, since the neural network output corresponding to the hip joint is forced to zero, the adaptive controller learns a different way to compensate for the unknown dynamics of the system using only the compensation available in the knee joint, which demonstrates the adaptation capabilities of the proposed controller.


V Conclusion
This paper presents a general adaptive neural network-based controller for velocity tracking of 3D bipedal robots under model uncertainties. The proposed adaptive controller builds upon the concept of the virtual constraints of HZD-based controllers and incorporates adaptive trajectory compensation using neural networks to compensate for the unknown dynamics of the system. The result is a structure of simple yet effective adaptive neural network based controllers applied in a decoupled manner to render stable and robust limit walking cycles that track the desired walking velocity in both longitudinal and lateral directions. The results show that the online learning process of the neural network is effective in compensating for different changes in the dynamic properties of the system, such as the torso mass and the position of the torso's center of mass. Moreover, the controller can learn different adaptation strategies, such as using the knee joint instead of the hip joint to compensate for the unknown dynamics. Improved rejection of disturbances, in the form of adversarial forces and uneven terrain, is also observed with the proposed controller. Future work will focus on implementing the adaptive feedback controller on actual robots in experiments.
References
- [1] E. Westervelt, J. Grizzle, and C. Wit, “Switching and pi control of walking motions of planar biped walkers,” Automatic Control, IEEE Transactions on, vol. 48, pp. 308 – 312, 03 2003.
- [2] X. Da, O. Harib, R. Hartley, B. Griffin, and J. W. Grizzle, “From 2d design of underactuated bipedal gaits to 3d implementation: Walking with speed tracking,” IEEE Access, vol. 4, pp. 3469–3478, 2016.
- [3] T. Kobayashi, K. Sekiyama, Y. Hasegawa, T. Aoyama, and T. Fukuda, “Virtual-dynamics-based reference gait speed generator for limit-cycle-based bipedal gait,” ROBOMECH Journal, vol. 5, p. NA, 2018.
- [4] C. Chevallereau, G. Abba, Y. Aoustin, F. Plestan, E. R. Westervelt, C. C. De-Wit, and J. W. Grizzle, “RABBIT: a testbed for advanced control theory,” IEEE Control Systems, vol. 23, no. 5, pp. 57–79, Oct. 2003.
- [5] G. M. Gasparri, S. Manara, D. Caporale, G. Averta, M. Bonilla, H. Marino, M. Catalano, G. Grioli, M. Bianchi, A. Bicchi, and M. Garabini, “Efficient walking gait generation via principal component representation of optimal trajectories: Application to a planar biped robot with elastic joints,” IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 2299–2306, 2018.
- [6] X. Da and J. Grizzle, “Combining trajectory optimization, supervised machine learning, and model structure for mitigating the curse of dimensionality in the control of bipedal robots,” The International Journal of Robotics Research, vol. 38, no. 9, pp. 1063–1097, 2019.
- [7] G. A. Castillo, B. Weng, W. Zhang, and A. Hereid, “Hybrid zero dynamics inspired feedback control policy design for 3d bipedal locomotion using reinforcement learning,” 2019.
- [8] Y. Aoustin and A. Formal’sky, “Control design for a biped: Reference trajectory based on driven angles as functions of the undriven angle,” Journal of Computer and Systems Sciences International, vol. 42, no. 4, pp. 645–662, 2003.
- [9] P. Manoonpong, T. Geng, T. Kulvicius, B. Porr, and F. Wörgötter, “Adaptive, fast walking in a biped robot under neuronal control and learning,” PLOS Computational Biology, vol. 3, no. 7, pp. 1–16, 07 2007.
- [10] S. Aoi, P. Manoonpong, Y. Ambe, F. Matsuno, and F. Wörgötter, “Adaptive control strategies for interlimb coordination in legged robots: A review,” Frontiers in Neurorobotics, vol. 11, p. 39, 2017.
- [11] S. Rezazadeh, C. Hubicki, M. Jones, A. Peekema, J. Van Why, A. Abate, and J. Hurst, “Spring-mass walking with atrias in 3d: robust gait control spanning zero to 4.3 kph on a heavily underactuated bipedal robot,” in ASME 2015 Dynamic Systems and Control Conference. American Society of Mechanical Engineers, 2015.
- [12] M. Gnucci and R. Marino, “On the adaptive control of the acrobot,” in 2019 18th European Control Conference (ECC), June 2019, pp. 3438–3443.
- [13] E. R. Westervelt, J. W. Grizzle, C. Chevallereau, J. H. Choi, and B. Morris, Feedback control of dynamic bipedal robot locomotion. CRC press Boca Raton, 2007.
- [14] A. D. Ames, “Human-inspired control of bipedal walking robots,” IEEE Transactions on Automatic Control, vol. 59, no. 5, pp. 1115–1130, May 2014.
- [15] A. Hereid, C. M. Hubicki, E. A. Cousineau, and A. D. Ames, “Dynamic humanoid locomotion: a scalable formulation for HZD gait optimization,” IEEE Transactions on Robotics, vol. 34, no. 2, pp. 370–387, Apr. 2018.
- [16] J. P. Reher, A. Hereid, S. Kolathaya, C. M. Hubicki, and A. D. Ames, “Algorithmic foundations of realizing multi-contact locomotion on the humanoid robot DURUS,” in the 12th International Workshop on the Algorithmic Foundations of Robotics (WAFR). San Francisco: Springer, Dec. 2016.
- [17] Y. Gong, R. Hartley, X. Da, A. Hereid, O. Harib, J.-K. Huang, and J. Grizzle, “Feedback control of a Cassie bipedal robot: walking, standing, and riding a segway,” American Control Conference (ACC), 2019.
- [18] K. S. Narendra and S. Mukhopadhyay, “Adaptive control using neural networks and approximate models,” IEEE Transactions on Neural Networks, vol. 8, no. 3, pp. 475–485, May 1997.
- [19] F. Lewis, “Neural network control of robot manipulators,” IEEE Intelligent Systems, vol. 11, no. 03, pp. 64–75, June 1996.
- [20] R. M. Sanner and J. E. Slotine, “Stable adaptive control of robot manipulators using “neural” networks,” Neural Computation, vol. 7, no. 4, pp. 753–790, July 1995.
- [21] F. Chen and H. Khalil, “Adaptive control of nonlinear systems using neural networks,” International Journal of Control, vol. 55, no. 6, pp. 1299–1317, 1992.
- [22] J.-J. E. Slotine and W. Li, “On the adaptive control of robot manipulators,” The International Journal of Robotics Research, vol. 6, no. 3, pp. 49–59, 1987.
- [23] C. Eliasmith and C. H. Anderson, Neural engineering: Computation, representation, and dynamics in neurobiological systems. Cambridge, MA: MIT Press, 2003.
- [24] T. DeWolf, T. C. Stewart, J.-J. Slotine, and C. Eliasmith, “A spiking neural model of adaptive arm control,” Proceedings of the Royal Society B, vol. 283, no. 48, 2016.
- [25] T. C. Stewart, T. DeWolf, A. Kleinhans, and C. Eliasmith, “Closed-loop neuromorphic benchmarks,” Frontiers in Neuroscience, vol. 9, 2015.
- [26] E. Todorov, T. Erez, and Y. Tassa, “MuJoCo: a physics engine for model-based control,” in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct. 2012, pp. 5026–5033.
- [27] “Simulation results for Cassie in MuJoCo,” https://youtu.be/DAHk9-GFS0k, accessed: 2020-07-31.