ADP–based control of a two–wheeled self–balancing mobile robot


Vladimir Stojanović
Vladimir Djordjević
Ljubiša Dubonjić
Marko Nikolić

Abstract

This study investigates optimal control of a two-wheeled self-balancing mobile robot whose dynamics are unknown. The objective is to achieve asymptotic tracking and disturbance rejection while minimizing a given performance index. By combining the internal model principle with adaptive dynamic programming (ADP), an approximately optimal controller is learned online and iteratively from measurable input and output data. The same data are also used to reconstruct states that cannot be measured directly. The ADP method solves the discrete-time algebraic Riccati equation iteratively. Simulation results demonstrate the effectiveness and feasibility of the proposed approach.
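To illustrate the iterative Riccati solution mentioned in the abstract, the following is a minimal model-based sketch of Hewer-style policy iteration for the discrete-time algebraic Riccati equation (DARE), the classical counterpart of the data-driven ADP scheme. It is not the paper's algorithm: the matrices A, B, Q, R are a hypothetical discretized double integrator standing in for the robot dynamics (which the paper treats as unknown), and the initial stabilizing gain is an assumption.

```python
# Sketch only: model-based policy iteration converging to the DARE solution.
# A, B, Q, R and the initial gain K are illustrative assumptions, not taken
# from the paper.
import numpy as np

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])   # hypothetical discretized double integrator
B = np.array([[0.005],
              [0.1]])
Q = np.eye(2)                # state weighting
R = np.array([[1.0]])        # input weighting

def dlyap(Acl, M, iters=1000):
    """Solve P = M + Acl' P Acl by fixed-point iteration (Acl Schur-stable)."""
    P = M.copy()
    for _ in range(iters):
        P = M + Acl.T @ P @ Acl
    return P

K = np.array([[1.0, 1.0]])   # assumed initial stabilizing gain
for _ in range(50):
    Acl = A - B @ K                       # closed loop under current policy
    P = dlyap(Acl, Q + K.T @ R @ K)       # policy evaluation (Lyapunov step)
    K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # policy improvement
    if np.linalg.norm(K_new - K) < 1e-10:
        break
    K = K_new

# Residual of the DARE: should shrink to ~0 as the iteration converges
residual = P - (A.T @ P @ A
                - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
                + Q)
print(np.linalg.norm(residual))
```

In the data-driven setting of the paper, the same policy-evaluation/policy-improvement structure is retained, but the Lyapunov step is replaced by least-squares estimation from measured input-output trajectories, so A and B never need to be known.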

Article Details

How to Cite
[1]
V. Stojanović, V. Djordjević, L. Dubonjić, and M. Nikolić, “ADP–based control of a two–wheeled self–balancing mobile robot”, ET, vol. 3, no. 4, Dec. 2024.
Section
Original Scientific Papers

