Space Robotics and Automation; Reinforcement Learning; Deep Learning in Grasping and Manipulation
Abstract:
Extraterrestrial rovers with a general-purpose robotic arm have many potential applications in lunar and planetary exploration. Introducing autonomy into such systems is desirable for increasing the time that rovers can spend gathering scientific data and collecting samples. This work investigates the applicability of deep reinforcement learning for vision-based robotic grasping of objects on the Moon. A novel simulation environment with procedurally-generated datasets is created to train agents under challenging conditions in unstructured scenes with uneven terrain and harsh illumination. A model-free off-policy actor-critic algorithm is then employed for end-to-end learning of a policy that directly maps compact octree observations to continuous actions in Cartesian space. Experimental evaluation indicates that 3D data representations enable more effective learning of manipulation skills when compared to traditionally used image-based observations. Domain randomization improves the generalization of learned policies to novel scenes with previously unseen objects and different illumination conditions. To this end, we demonstrate zero-shot sim-to-real transfer by evaluating trained agents on a real robot in a Moon-analogue facility.
Disciplines:
Computer science; Aerospace engineering
Author, co-author:
ORSULA, Andrej ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Space Robotics
Bøgh, Simon; Aalborg University > Department of Materials and Production > Robotics and Automation
OLIVARES MENDEZ, Miguel Angel ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Space Robotics
MARTINEZ LUNA, Carol ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Space Robotics
External co-authors:
yes
Document language:
English
Title:
Learning to Grasp on the Moon from 3D Octree Observations with Deep Reinforcement Learning
Publication/release date:
23 October 2022
Event name:
2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Event location:
Kyoto, Japan
Event dates:
23/10/2022 → 27/10/2022
Event scope:
International
Title of the main work:
Proceedings of 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)