Mohammad Ghavamzadeh
Google Research
Verified email at google.com
Title
Cited by
Year
Natural actor–critic algorithms
S Bhatnagar, RS Sutton, M Ghavamzadeh, M Lee
Automatica 45 (11), 2471-2482, 2009
Cited by: 567, Year: 2009
Bayesian reinforcement learning: A survey
M Ghavamzadeh, S Mannor, J Pineau, A Tamar
arXiv preprint arXiv:1609.04436, 2016
Cited by: 212, Year: 2016
Best arm identification: A unified approach to fixed budget and fixed confidence
V Gabillon, M Ghavamzadeh, A Lazaric
Advances in Neural Information Processing Systems, 3212-3220, 2012
Cited by: 190, Year: 2012
Regularized policy iteration
A Farahmand, M Ghavamzadeh, S Mannor, C Szepesvári
Advances in Neural Information Processing Systems 21, 441-448, 2008
Cited by: 151, Year: 2008
Hierarchical multi-agent reinforcement learning
R Makar, S Mahadevan, M Ghavamzadeh
Proceedings of the fifth international conference on Autonomous agents, 246-253, 2001
Cited by: 149, Year: 2001
High-confidence off-policy evaluation
PS Thomas, G Theocharous, M Ghavamzadeh
Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015
Cited by: 147, Year: 2015
Hierarchical multi-agent reinforcement learning
M Ghavamzadeh, S Mahadevan, R Makar
Autonomous Agents and Multi-Agent Systems 13 (2), 197-229, 2006
Cited by: 137, Year: 2006
Supervised actor-critic reinforcement learning
MT Rosenstein, AG Barto, J Si, A Barto, W Powell
Learning and Approximate Dynamic Programming: Scaling Up to the Real World …, 2004
Cited by: 134, Year: 2004
A Lyapunov-based approach to safe reinforcement learning
Y Chow, O Nachum, E Duenez-Guzman, M Ghavamzadeh
Advances in Neural Information Processing Systems, 8092-8101, 2018
Cited by: 119, Year: 2018
High confidence policy improvement
P Thomas, G Theocharous, M Ghavamzadeh
International Conference on Machine Learning, 2380-2388, 2015
Cited by: 111, Year: 2015
Finite-Sample Analysis of Proximal Gradient TD Algorithms
B Liu, J Liu, M Ghavamzadeh, S Mahadevan, M Petrik
UAI, 504-513, 2015
Cited by: 108, Year: 2015
Risk-constrained reinforcement learning with percentile risk criteria
Y Chow, M Ghavamzadeh, L Janson, M Pavone
The Journal of Machine Learning Research 18 (1), 6070-6120, 2017
Cited by: 96, Year: 2017
Speedy Q-learning
MG Azar, R Munos, M Ghavamzadeh, HJ Kappen
Spain, Granada: NIPS, 2011
Cited by: 94, Year: 2011
Bayesian multi-task reinforcement learning
A Lazaric, M Ghavamzadeh
Cited by: 89, Year: 2010
More robust doubly robust off-policy evaluation
M Farajtabar, Y Chow, M Ghavamzadeh
arXiv preprint arXiv:1802.03493, 2018
Cited by: 88, Year: 2018
Multi-bandit best arm identification
V Gabillon, M Ghavamzadeh, A Lazaric, S Bubeck
Advances in Neural Information Processing Systems 24, 2222-2230, 2011
Cited by: 88, Year: 2011
Finite-sample analysis of least-squares policy iteration
A Lazaric, M Ghavamzadeh, R Munos
The Journal of Machine Learning Research 13 (1), 3041-3074, 2012
Cited by: 86, Year: 2012
Bayesian policy gradient algorithms
M Ghavamzadeh, Y Engel
Advances in Neural Information Processing Systems 19, 457-464, 2006
Cited by: 86, Year: 2006
Analysis of a classification-based policy iteration algorithm
A Lazaric, M Ghavamzadeh, R Munos
Cited by: 83, Year: 2010
Regularized fitted Q-iteration for planning in continuous-space Markovian decision problems
A Farahmand, M Ghavamzadeh, C Szepesvári, S Mannor
2009 American Control Conference, 725-730, 2009
Cited by: 80, Year: 2009
Articles 1–20