We present a simple randomized POMDP algorithm for planning with continuous actions in partially observable environments. Our algorithm operates on a set of reachable belief points, sampled by letting the robot interact randomly with the environment. We perform value iteration steps, ensuring that in each step the value of all sampled belief points is improved. The idea is that by sampling actions from a continuous action space we can quickly improve the value of all belief points in the set. We demonstrate the viability of our algorithm in two sets of experiments: one involving an active localization task and one concerning robot navigation in a perceptually aliased office environment.
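To make the idea concrete, here is a minimal sketch of one randomized value-iteration stage over a fixed belief set, in the spirit described above. The helper names `sample_action` (draws an action from the continuous action space) and `backup` (the point-based backup for one belief/action pair) are hypothetical placeholders, not interfaces from the paper; this is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np

def randomized_vi_stage(beliefs, alpha_vectors, sample_action, backup,
                        n_action_samples=10):
    """One randomized value-iteration stage over a fixed belief set.

    beliefs       : list of belief vectors (np.ndarray over states)
    alpha_vectors : current value function, a list of alpha vectors
    sample_action : callable, draws an action from the continuous space
    backup        : callable, backup(b, a, alpha_vectors) -> alpha vector
    """
    def value(b, vectors):
        # Point-based value of belief b under a set of alpha vectors.
        return max(np.dot(alpha, b) for alpha in vectors)

    new_vectors = []
    not_improved = list(beliefs)
    while not_improved:
        # Pick a belief uniformly at random among those not yet improved.
        b = not_improved[np.random.randint(len(not_improved))]
        # Sample a handful of actions from the continuous action space
        # and back up the belief with each of them.
        candidates = [backup(b, sample_action(), alpha_vectors)
                      for _ in range(n_action_samples)]
        best = max(candidates, key=lambda alpha: np.dot(alpha, b))
        if np.dot(best, b) >= value(b, alpha_vectors):
            new_vectors.append(best)  # the sampled backup improved b
        else:
            # Otherwise keep b's best old vector so its value cannot drop.
            new_vectors.append(max(alpha_vectors,
                                   key=lambda alpha: np.dot(alpha, b)))
        # Any belief whose value is already matched or improved by the
        # new vectors needs no further backup this stage.
        not_improved = [bb for bb in not_improved
                        if value(bb, new_vectors) < value(bb, alpha_vectors)]
    return new_vectors
```

Because a single new alpha vector typically improves many beliefs at once, each stage usually terminates after backing up only a small fraction of the belief set, which is what makes the randomized scheme fast.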
Disciplines:
Computer science
Identifiers:
UNILU:UL-ARTICLE-2011-724
Author, co-author:
Spaan, M. T. J.
VLASSIS, Nikos; University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)
Document language:
English
Title:
Planning with Continuous Actions in Partially Observable Environments