This study explores a reinforcement learning (RL) approach to constructing attitude control strategies for a LEO satellite with flexible appendages. An attitude control system actuated by a set of three reaction wheels is considered. The satellite is assumed to move in a circular low Earth orbit under the action of the gravity-gradient torque, a random disturbance torque, and oscillations excited in the flexible appendages. The control policy for rest-to-rest slew maneuvers is learned via the Proximal Policy Optimization (PPO) technique. The robustness of the obtained control policy is analyzed and compared to that of conventional controllers. The first part of the study focuses on formulating the problem in terms of Markov Decision Processes, analyzing different reward-shaping techniques, and finally training the RL agent and comparing the obtained results with state-of-the-art RL controllers as well as with the performance of a commonly used quaternion feedback regulator (a Lyapunov-based PD controller). We then consider the same spacecraft with flexible appendages added to its structure. Equations of excitable oscillations are appended to the system, together with coupling terms describing the interactions between the main rigid body and the flexible structures. The dynamics of the rigid spacecraft thus becomes coupled with that of its flexible appendages, and the control strategy should change accordingly to prevent actions that excite oscillation modes. Again, PPO is used to learn the control policy for rest-to-rest slew maneuvers in the extended system. All in all, the proposed reinforcement learning strategy is shown to converge to a policy that matches the performance of the quaternion feedback regulator for a rigid spacecraft. It is also shown that a policy can be trained to take into account the highly nonlinear dynamics caused by the presence of flexible elements that need to be brought to rest in the required attitude.
We also discuss the advantages of the reinforcement learning approach, such as robustness and the ability to learn online, which pertain to systems that require a high level of autonomy.
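The quaternion feedback regulator mentioned above as the baseline (a Lyapunov-based PD law) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the scalar-first quaternion convention, the error definition, and the gains `kp` and `kd` are assumptions chosen for the example.

```python
import numpy as np

def quat_mul(p, q):
    """Hamilton product of two quaternions in [w, x, y, z] (scalar-first) order."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw*qw - px*qx - py*qy - pz*qz,
        pw*qx + px*qw + py*qz - pz*qy,
        pw*qy - px*qz + py*qw + pz*qx,
        pw*qz + px*qy - py*qx + pz*qw,
    ])

def quat_conj(q):
    """Conjugate (inverse for a unit quaternion)."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def quaternion_feedback_torque(q_current, q_target, omega, kp=0.5, kd=2.0):
    """PD quaternion-feedback control torque.

    u = -kp * sign(qe_w) * qe_vec - kd * omega, where qe = q_target^-1 * q_current.
    The sign term enforces the shorter rotation; kp and kd are illustrative
    placeholder gains, not values from the paper.
    """
    q_err = quat_mul(quat_conj(q_target), q_current)
    return -kp * np.sign(q_err[0]) * q_err[1:] - kd * np.asarray(omega)
```

At the target attitude with zero angular rate the commanded torque vanishes; away from it, the torque opposes both the attitude error and the body rate, which is what makes the law a convenient benchmark for the learned policies.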
Research centre:
Interdisciplinary Centre for Security, Reliability and Trust (SnT) > ARG - Automation & Robotics
Disciplines:
Aerospace engineering
Author, co-author:
MAHFOUZ, Ahmed ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Automation
Valiullin, Ayrat
Lukashevichus, Alexey
Pritykin, Dmitry
External co-authors:
yes
Document language:
English
Title:
Reinforcement Learning for Attitude Control of a Spacecraft with Flexible Appendages
Publication date:
September 2022
Event name:
73rd International Astronautical Congress
Event organizer:
International Astronautical Federation
Event location:
Paris, France
Event dates:
from 18-09-2022 to 22-09-2022
Title of the main work:
IAC 2022 congress proceedings, 73rd International Astronautical Congress (IAC)
Publisher:
International Astronautical Federation, Paris, France
Edition:
73rd
Peer reviewed:
Editorial reviewed
FNR project:
FNR14302465 - Development Tool For Autonomous Constellation And Formation Control Of Microsatellites, 2019 (01/09/2020-31/08/2023) - Holger Voos
References:
Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. The MIT Press, second edition, 2018.
D. Cellucci, Nick B. Cramer, and Jeremy D. Frank. Distributed Spacecraft Autonomy.
Seongin Na, Tomáš Krajník, Barry Lennox, and Farshad Arvin. Federated reinforcement learning for collective navigation of robotic swarms, 2022.
F. Vedant, J.T. Allison, M. West, and A. Ghosh. Reinforcement learning for spacecraft attitude control. In Proceedings of the International Astronautical Congress, IAC, volume 2019-October, 2019.
Vanessa Tan, John Leur Labrador, and Marc Caesar Talampas. MATA-RL: Continuous reaction wheel attitude control using the MATA simulation software and reinforcement learning. In Proceedings of the 35th Annual Small Satellite Conference, 2021.
Jacob G. Elkins, Rohan Sood, and Clemens Rumpf. Bridging reinforcement learning and online learning for spacecraft attitude control. Journal of Aerospace Information Systems, 19(1):62-69, 2022.
Daniel Alazard, Christelle Cumer, and Khalid Tantawi. Linear dynamic modeling of spacecraft with various flexible appendages and on-board angular momentums. In 7th International ESA Conference on Guidance, Navigation & Control Systems (GNC 2008), pages 1-14, Tralee, IE, 2008.
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. ArXiv, abs/1707.06347, 2017.
Bong Wie and Peter M. Barba. Quaternion feedback for spacecraft large angle maneuvers. Journal of Guidance, Control, and Dynamics, 8(3):360-365, 1985.
I A Courie, Francesco Sanfedino, and Daniel Alazard. Worst-case pointing performance analysis for large flexible spacecraft. ArXiv, abs/2106.01893, 2021.
Bong Wie, Haim Weiss, and Ari Arapostathis. Quarternion feedback regulator for spacecraft eigenaxis rotations. Journal of Guidance, Control, and Dynamics, 12(3):375-380, 1989.