Shuxin Zheng
Asynchronous stochastic gradient descent with delay compensation
S Zheng, Q Meng, T Wang, W Chen, N Yu, ZM Ma, TY Liu
Proceedings of the 34th International Conference on Machine Learning, PMLR …, 2017
On layer normalization in the transformer architecture
R Xiong, Y Yang, D He, K Zheng, S Zheng, C Xing, H Zhang, Y Lan, ...
Proceedings of the 37th International Conference on Machine Learning, 2020
G-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space
Q Meng, S Zheng, H Zhang, W Chen, Q Ye, ZM Ma, TY Liu
Proceedings of the 7th International Conference on Learning Representations …, 2018
Capacity control of ReLU neural networks by basis-path norm
S Zheng, Q Meng, H Zhang, W Chen, N Yu, TY Liu
Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 2019
Invertible Image Rescaling
M Xiao, S Zheng, C Liu, Y Wang, D He, G Ke, J Bian, Z Lin, TY Liu
European Conference on Computer Vision (ECCV), 2020
Cross-iteration batch normalization
Z Yao, Y Cao, S Zheng, G Huang, S Lin
arXiv preprint arXiv:2002.05712, 2020
Deep learning for prediction of the air quality response to emission changes
J Xing, S Zheng, D Ding, JT Kelly, S Wang, S Li, T Qin, M Ma, Z Dong, ...
Environmental science & technology 54 (14), 8589-8600, 2020
OptQuant: Distributed Training of Neural Networks with Optimized Quantization Mechanisms
L He, S Zheng, W Chen, ZM Ma, TY Liu
Neurocomputing 340, 233-244, 2019
MC-BERT: Efficient language pre-training via a meta controller
Z Xu, L Gong, G Ke, D He, S Zheng, L Wang, J Bian, TY Liu
arXiv preprint arXiv:2006.05744, 2020
Modeling Lost Information in Lossy Image Compression
Y Wang, M Xiao, C Liu, S Zheng, TY Liu
arXiv preprint arXiv:2006.11999, 2020
Supplementary material: On Layer Norm in the Transformer Architecture
R Xiong, Y Yang, D He, K Zheng, S Zheng, C Xing, H Zhang, Y Lan, ...