A. Rupam Mahmood
Title
Cited by
Year
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
RS Sutton, AR Mahmood, M White
Journal of Machine Learning Research 17, 2016
Cited by 144 · Year 2016
Weighted importance sampling for off-policy learning with linear function approximation
AR Mahmood, H van Hasselt, RS Sutton
Advances in Neural Information Processing Systems 27, 2014
Cited by 97 · Year 2014
True Online Temporal-Difference Learning
H van Seijen, AR Mahmood, PM Pilarski, MC Machado, RS Sutton
Journal of Machine Learning Research 17, 2016
Cited by 72 · Year 2016
Benchmarking Reinforcement Learning Algorithms on Real-World Robots
AR Mahmood, D Korenkevych, G Vasan, W Ma, J Bergstra
Proceedings of the 2nd Annual Conference on Robot Learning (CoRL), 2018
Cited by 63 · Year 2018
Tuning-free step-size adaptation
AR Mahmood, RS Sutton, T Degris, PM Pilarski
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International…, 2012
Cited by 42 · Year 2012
Off-policy TD (λ) with a true online equivalence
H van Hasselt, AR Mahmood, RS Sutton
Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence…, 2014
Cited by 36 · Year 2014
A new Q (λ) with interim forward view and Monte Carlo equivalence
RS Sutton, AR Mahmood, D Precup, H van Hasselt
Cited by 35 · Year 2014
Multi-step Off-policy Learning Without Importance Sampling Ratios
AR Mahmood, H Yu, RS Sutton
arXiv preprint arXiv:1702.03006, 2017
Cited by 29 · Year 2017
Setting up a reinforcement learning task with a real-world robot
AR Mahmood, D Korenkevych, BJ Komer, J Bergstra
2018 IEEE/RSJ International Conference on Intelligent Robots and Systems…, 2018
Cited by 27 · Year 2018
Off-policy learning based on weighted importance sampling with linear computational complexity
AR Mahmood, RS Sutton
Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence…, 2015
Cited by 26 · Year 2015
Emphatic temporal-difference learning
AR Mahmood, H Yu, M White, RS Sutton
arXiv preprint arXiv:1507.01569, 2015
Cited by 24 · Year 2015
Representation Search through Generate and Test
AR Mahmood, RS Sutton
Workshops at the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013
Cited by 23 · Year 2013
On generalized Bellman equations and temporal-difference learning
H Yu, AR Mahmood, RS Sutton
The Journal of Machine Learning Research 19 (1), 1864-1912, 2018
Cited by 17 · Year 2018
Autoregressive Policies for Continuous Control Deep Reinforcement Learning
D Korenkevych, AR Mahmood, G Vasan, J Bergstra
Proceedings of the 28th International Joint Conference on Artificial…, 2019
Cited by 12 · Year 2019
Incremental Off-policy Reinforcement Learning Algorithms
A Mahmood
University of Alberta, 2017
Cited by 12 · Year 2017
Structure Learning of Causal Bayesian Networks: A Survey
A Mahmood
Department of Computing Science, University of Alberta, Edmonton, Canada…, 2011
Cited by 7 · Year 2011
Automatic step-size adaptation in incremental supervised learning
A Mahmood
University of Alberta, 2010
Cited by 7 · Year 2010
Heteroscedastic Uncertainty for Robust Generative Latent Dynamics
O Limoyo, B Chan, F Marić, B Wagstaff, AR Mahmood, J Kelly
IEEE Robotics and Automation Letters 5 (4), 6654-6661, 2020
Cited by 1 · Year 2020
An Empirical Evaluation of True Online TD (λ)
H van Seijen, AR Mahmood, PM Pilarski, RS Sutton
arXiv preprint arXiv:1507.00353, 2015
Cited by 1 · Year 2015
Model-free Policy Learning with Reward Gradients
Q Lan, AR Mahmood
arXiv preprint arXiv:2103.05147, 2021
Year 2021