Reference : Robust Exploration/Exploitation Trade-Offs in Safety-Critical Applications
Scientific congresses, symposiums and conference proceedings : Paper published in a book
Engineering, computing & technology : Computer science
Engineering, computing & technology : Electrical & electronics engineering
http://hdl.handle.net/10993/12682
Robust Exploration/Exploitation Trade-Offs in Safety-Critical Applications
English
Tokic, M.
Ertle, P.
Palm, G. [University of Ulm]
Soeffker, D. [University of Duisburg-Essen]
Voos, Holger [University of Luxembourg > Faculty of Science, Technology and Communication (FSTC) > Engineering Research Unit ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT)]
2012
8th IFAC Int. Symposium on Fault Detection, Supervision and Safety for Technical Processes, Mexico City 29-31 August 2012
Yes
No
International
8th IFAC Int. Symposium on Fault Detection, Supervision and Safety for Technical Processes
29-31 August 2012
Mexico City
Mexico
[en] Robotics ; Safety ; Learning
[en] In future service robots, unsafe exceptional circumstances that are hard to foresee can occur in complex systems. This paper investigates the assumption of having no knowledge about the environment, using reinforcement learning as an option for learning behavior by trial and error. In such a scenario, action-selection decisions are made based on predictions of future reward, with the aim of minimizing the cost of reaching a goal. It is shown that the selection of safety-critical actions, which incur high costs from the environment, is directly related to the exploration/exploitation dilemma in temporal-difference learning. To this end, several exploration policies are investigated with regard to worst- and best-case performance in a dynamic environment. Our results show that, in contrast to established exploration policies such as epsilon-Greedy and Softmax, the recently proposed VDBE-Softmax policy seems more appropriate for such applications because its exploration parameter is robust to unexpected situations.
Researchers ; Professionals ; Students

File(s) associated to this reference

Fulltext file(s):

File : SAFEPROCESS-2012.pdf
Version : Author postprint
Size : 363.43 kB
Access : Limited access (request a copy)

All documents in ORBilu are protected by a user license.