Learning to Grasp on the Moon from 3D Octree Observations with Deep Reinforcement Learning

Orsula, Andrej; Bøgh, Simon; Olivares Mendez, Miguel Angel; Martinez Luna, Carol

doi:10.1109/IROS47612.2022.9981661

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2022 • In Proceedings of 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Peer reviewed

Permalink
https://hdl.handle.net/10993/51908

DOI
10.1109/IROS47612.2022.9981661

Files

Full Text

IROS22_1460.pdf

Author postprint (6.24 MB)

Final submission

All documents in ORBilu are protected by a user license.

Details

Keywords :

Space Robotics and Automation; Reinforcement Learning; Deep Learning in Grasping and Manipulation

Abstract :

Extraterrestrial rovers with a general-purpose robotic arm have many potential applications in lunar and planetary exploration. Introducing autonomy into such systems is desirable for increasing the time that rovers can spend gathering scientific data and collecting samples. This work investigates the applicability of deep reinforcement learning for vision-based robotic grasping of objects on the Moon. A novel simulation environment with procedurally-generated datasets is created to train agents under challenging conditions in unstructured scenes with uneven terrain and harsh illumination. A model-free off-policy actor-critic algorithm is then employed for end-to-end learning of a policy that directly maps compact octree observations to continuous actions in Cartesian space. Experimental evaluation indicates that 3D data representations enable more effective learning of manipulation skills when compared to traditionally used image-based observations. Domain randomization improves the generalization of learned policies to novel scenes with previously unseen objects and different illumination conditions. To this end, we demonstrate zero-shot sim-to-real transfer by evaluating trained agents on a real robot in a Moon-analogue facility.

Disciplines :

Computer science
Aerospace & aeronautics engineering

Author, co-author :

Orsula, Andrej; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Space Robotics

Bøgh, Simon; Aalborg University > Department of Materials and Production > Robotics and Automation

Olivares Mendez, Miguel Angel; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Space Robotics

Martinez Luna, Carol; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > Space Robotics

External co-authors :

yes

Language :

English

Title :

Learning to Grasp on the Moon from 3D Octree Observations with Deep Reinforcement Learning

Publication date :

23 October 2022

Event name :

2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Event place :

Kyoto, Japan

Event date :

23/10/2022 → 27/10/2022

Audience :

International

Main work title :

Proceedings of 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Peer reviewed :

Peer reviewed

Additional URL :

https://github.com/AndrejOrsula/drl_grasping
https://youtube.com/watch?v=FZSoOkK6VFc

Available on ORBilu :

since 22 August 2022

Statistics

Number of views

193 (58 by Unilu)

Number of downloads

52 (9 by Unilu)
