Matteo Papini
Title
Cited by
Year
Stochastic variance-reduced policy gradient
M Papini, D Binaghi, G Canonaco, M Pirotta, M Restelli
Proceedings of the 35th International Conference on Machine Learning 80 …, 2018
Cited by: 196, Year: 2018
Policy optimization via importance sampling
AM Metelli, M Papini, F Faccio, M Restelli
Advances in Neural Information Processing Systems 31, 2018
Cited by: 111, Year: 2018
Feature selection via mutual information: New theoretical insights
M Beraha, AM Metelli, M Papini, A Tirinzoni, M Restelli
2019 International Joint Conference on Neural Networks (IJCNN), 1-9, 2019
Cited by: 102, Year: 2019
Risk-averse trust region optimization for reward-volatility reduction
L Bisi, L Sabbioni, E Vittori, M Papini, M Restelli
arXiv preprint arXiv:1912.03193, 2019
Cited by: 69, Year: 2019
Importance sampling techniques for policy optimization
AM Metelli, M Papini, N Montali, M Restelli
Journal of Machine Learning Research 21 (141), 1-75, 2020
Cited by: 57, Year: 2020
Adaptive batch size for safe policy gradients
M Papini, M Pirotta, M Restelli
Advances in Neural Information Processing Systems 30, 2017
Cited by: 47, Year: 2017
Gradient-aware model-based policy search
P D'Oro, AM Metelli, A Tirinzoni, M Papini, M Restelli
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3801-3808, 2020
Cited by: 45, Year: 2020
Smoothing policies and safe policy gradients
M Papini, M Pirotta, M Restelli
Machine Learning 111 (11), 4081-4137, 2022
Cited by: 40, Year: 2022
Optimistic policy optimization via multiple importance sampling
M Papini, AM Metelli, L Lupo, M Restelli
International Conference on Machine Learning, 4989-4999, 2019
Cited by: 40, Year: 2019
Leveraging good representations in linear contextual bandits
M Papini, A Tirinzoni, M Restelli, A Lazaric, M Pirotta
International Conference on Machine Learning, 8371-8380, 2021
Cited by: 28, Year: 2021
Reinforcement learning in linear MDPs: Constant regret and representation selection
M Papini, A Tirinzoni, A Pacchiano, M Restelli, A Lazaric, M Pirotta
Advances in Neural Information Processing Systems 34, 16371-16383, 2021
Cited by: 19, Year: 2021
Balancing learning speed and stability in policy gradient via adaptive exploration
M Papini, A Battistello, M Restelli
International Conference on Artificial Intelligence and Statistics, 1188-1199, 2020
Cited by: 17, Year: 2020
Lifting the information ratio: An information-theoretic analysis of Thompson sampling for contextual bandits
G Neu, I Olkhovskaia, M Papini, L Schwartz
Advances in Neural Information Processing Systems 35, 9486-9498, 2022
Cited by: 13, Year: 2022
Policy optimization as online learning with mediator feedback
AM Metelli, M Papini, P D'Oro, M Restelli
Proceedings of the AAAI Conference on Artificial Intelligence 35 (10), 8958-8966, 2021
Cited by: 13, Year: 2021
Offline primal-dual reinforcement learning for linear MDPs
G Gabbianelli, G Neu, M Papini, NM Okolo
International Conference on Artificial Intelligence and Statistics, 3169-3177, 2024
Cited by: 8, Year: 2024
Importance-weighted offline learning done right
G Gabbianelli, G Neu, M Papini
International Conference on Algorithmic Learning Theory, 614-634, 2024
Cited by: 5, Year: 2024
Online adversarial MDPs with off-policy feedback and known transitions
F Bacchiocchi, FE Stradi, M Papini, AM Metelli, N Gatti
Sixteenth European Workshop on Reinforcement Learning, 2023
Cited by: 5, Year: 2023
Online learning with off-policy feedback
G Gabbianelli, G Neu, M Papini
International Conference on Algorithmic Learning Theory, 620-641, 2023
Cited by: 4, Year: 2023
Scalable representation learning in linear contextual bandits with constant regret guarantees
A Tirinzoni, M Papini, A Touati, A Lazaric, M Pirotta
Advances in Neural Information Processing Systems 35, 2307-2319, 2022
Cited by: 3, Year: 2022
Automated Reasoning for Reinforcement Learning Agents in Structured Environments.
A Gianola, M Montali, M Papini
OVERLAY@GandALF, 43-48, 2021
Cited by: 3, Year: 2021