Konstantin Mishchenko
Meta
Verified email at meta.com - Homepage
Title
Cited by
Year
Tighter Theory for Local SGD on Identical and Heterogeneous Data
A Khaled, K Mishchenko, P Richtárik
International Conference on Artificial Intelligence and Statistics, 4519-4529, 2020
Cited by: 495 · Year: 2020
Distributed learning with compressed gradient differences
K Mishchenko, E Gorbunov, M Takáč, P Richtárik
Optimization Methods and Software, 1-16, 2019
Cited by: 234* · Year: 2019
Stochastic distributed learning with gradient quantization and double-variance reduction
S Horváth, D Kovalev, K Mishchenko, P Richtárik, S Stich
Optimization Methods and Software, 1-16, 2022
Cited by: 191 · Year: 2022
First Analysis of Local GD on Heterogeneous Data
A Khaled, K Mishchenko, P Richtárik
NeurIPS FL Workshop, arXiv preprint arXiv:1909.04715, 2019
Cited by: 183 · Year: 2019
Random Reshuffling: Simple Analysis with Vast Improvements
K Mishchenko, A Khaled, P Richtárik
Advances in Neural Information Processing Systems 33, 17309-17320, 2020
Cited by: 150 · Year: 2020
ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
K Mishchenko, G Malinovsky, S Stich, P Richtárik
International Conference on Machine Learning, 15750-15769, 2022
Cited by: 145 · Year: 2022
Adaptive gradient descent without descent
Y Malitsky, K Mishchenko
International Conference on Machine Learning, 6702-6712, 2020
Cited by: 111 · Year: 2020
Revisiting stochastic extragradient
K Mishchenko, D Kovalev, E Shulgin, P Richtárik, Y Malitsky
International Conference on Artificial Intelligence and Statistics, 4573-4582, 2020
Cited by: 90 · Year: 2020
SEGA: Variance Reduction via Gradient Sketching
F Hanzely, K Mishchenko, P Richtárik
Advances in Neural Information Processing Systems, 2082-2093, 2018
Cited by: 87 · Year: 2018
Learning-Rate-Free Learning by D-Adaptation
A Defazio, K Mishchenko
International Conference on Machine Learning, 2023
Cited by: 74 · Year: 2023
Asynchronous SGD Beats Minibatch SGD under Arbitrary Delays
K Mishchenko, F Bach, M Even, B Woodworth
Advances in Neural Information Processing Systems 35, 420-433, 2022
Cited by: 57 · Year: 2022
Stochastic Newton and cubic Newton methods with simple local linear-quadratic rates
D Kovalev, K Mishchenko, P Richtárik
NeurIPS Workshop Beyond First Order Methods in ML, arXiv preprint arXiv:1912 …, 2019
Cited by: 50 · Year: 2019
A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning
K Mishchenko, F Iutzeler, J Malick, MR Amini
International Conference on Machine Learning, 3584-3592, 2018
Cited by: 49 · Year: 2018
Regularized Newton Method with Global Convergence
K Mishchenko
SIAM Journal on Optimization 33 (3), 1440-1462, 2023
Cited by: 48 · Year: 2023
Proximal and Federated Random Reshuffling
K Mishchenko, A Khaled, P Richtárik
International Conference on Machine Learning, 15718-15749, 2022
Cited by: 43 · Year: 2022
Dualize, split, randomize: Toward fast nonsmooth optimization algorithms
A Salim, L Condat, K Mishchenko, P Richtárik
Journal of Optimization Theory and Applications 195 (1), 102-130, 2022
Cited by: 41 · Year: 2022
99% of worker-master communication in distributed optimization is not needed
K Mishchenko, F Hanzely, P Richtárik
Conference on Uncertainty in Artificial Intelligence, 979-988, 2020
Cited by: 34* · Year: 2020
DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate
S Soori, K Mishchenko, A Mokhtari, MM Dehnavi, M Gurbuzbalaban
AISTATS 2020, 2019
Cited by: 31 · Year: 2019
A distributed flexible delay-tolerant proximal gradient algorithm
K Mishchenko, F Iutzeler, J Malick
SIAM Journal on Optimization 30 (1), 933-959, 2020
Cited by: 29 · Year: 2020
Prodigy: An expeditiously adaptive parameter-free learner
K Mishchenko, A Defazio
International Conference on Machine Learning, 2024
Cited by: 28 · Year: 2024
Articles 1–20