Critical Initialization of Wide and Deep Neural Networks using Partial Jacobians: General Theory and Applications | D Doshi, T He, A Gromov | Advances in Neural Information Processing Systems 36, 2024 | 10* | 2024 |
Universal sharpness dynamics in neural network training: Fixed point analysis, edge of stability, and route to chaos | DS Kalra, T He, M Barkeshli | arXiv preprint arXiv:2311.02076, 2023 | 1 | 2023 |
AutoInit: Automatic Initialization via Jacobian Tuning | T He, D Doshi, A Gromov | arXiv preprint arXiv:2206.13568, 2022 | 1 | 2022 |
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos | D Singh Kalra, T He, M Barkeshli | arXiv e-prints, arXiv:2311.02076, 2023 | | 2023 |
To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets | D Doshi, A Das, T He, A Gromov | arXiv preprint arXiv:2310.13061, 2023 | | 2023 |
Fracton models with crystalline symmetries in two dimensions | T He, A Gromov | APS March Meeting Abstracts 2021, L43.004, 2021 | | 2021 |