Aberdeen, D., and Baxter, J. 2002. Scaling internal-state policy-gradient methods for POMDPs. In ICML, 3-10.
Castro, P. S., and Precup, D. 2007. Using linear programming for Bayesian exploration in Markov decision processes. In IJCAI, 2437-2442.
Dearden, R.; Friedman, N.; and Andre, D. 1999. Model based Bayesian exploration. In UAI, 150-159.
Duff, M. 2002. Optimal Learning: Computational procedures for Bayes-adaptive Markov decision processes. Ph.D. Dissertation, University of Massachusetts Amherst.
Heckerman, D. 1999. A tutorial on learning with Bayesian networks. In Jordan, M., ed., Learning in Graphical Models. Cambridge, MA: MIT Press.
Kohl, N., and Stone, P. 2004. Policy gradient reinforcement learning for fast quadrupedal locomotion. In ICRA.
Jaulmes, R.; Pineau, J.; and Precup, D. 2005. Active learning in partially observable Markov decision processes. In ECML, 601-608.
Kearns, M.; Mansour, Y.; and Ng, A. 2002. A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Machine Learning 49:193-208.
Meuleau, N.; Peshkin, L.; Kim, K.-E.; and Kaelbling, L. P. 1999. Learning finite-state controllers for partially observable environments. In UAI, 427-436.
Ng, A. Y., and Jordan, M. I. 2000. PEGASUS: a policy search method for large MDPs and POMDPs. In UAI, 406-415.
Ng, A.; Kim, H. J.; Jordan, M.; and Sastry, S. 2003. Autonomous helicopter flight via reinforcement learning. In NIPS.
Ng, A.; Parr, R.; and Koller, D. 2000. Policy search via density estimation. In NIPS, 1022-1028.
Porta, J. M.; Vlassis, N. A.; Spaan, M. T. J.; and Poupart, P. 2006. Point-based value iteration for continuous POMDPs. Journal of Machine Learning Research 7:2329-2367.
Poupart, P.; Vlassis, N.; Hoey, J.; and Regan, K. 2006. An analytic solution to discrete Bayesian reinforcement learning. In ICML, 697-704.
Smallwood, R. D., and Sondik, E. J. 1973. The optimal control of partially observable Markov processes over a finite horizon. Operations Research 21:1071-1088.
Strens, M. 2000. A Bayesian framework for reinforcement learning. In ICML, 943-950.
Tesauro, G. J. 1995. Temporal difference learning and TD-Gammon. Communications of the ACM 38:58-68.
Wang, T.; Lizotte, D.; Bowling, M.; and Schuurmans, D. 2005. Bayesian sparse sampling for on-line reward optimization. In ICML, 956-963.