Attention scheme inspired softmax regression | Y Deng, Z Li, Z Song | arXiv preprint arXiv:2304.10411, 2023 | 41 | 2023 |
Solving regularized exp, cosh and sinh regression problems | Z Li, Z Song, T Zhou | arXiv preprint arXiv:2303.15725, 2023 | 17 | 2023 |
An improved sample complexity for rank-1 matrix sensing | Y Deng, Z Li, Z Song | arXiv preprint arXiv:2303.06895, 2023 | 14 | 2023 |
Zero-th order algorithm for softmax attention optimization | Y Deng, Z Li, S Mahadevan, Z Song | arXiv preprint arXiv:2307.08352, 2023 | 10 | 2023 |
Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression | Z Li, Z Song, Z Wang, J Yin | arXiv preprint arXiv:2311.15390, 2023 | 1 | 2023 |