Zhihao Bai
Verified email at jhu.edu
Title
Cited by
Year
DistCache: Provable Load Balancing for Large-Scale Storage Systems with Distributed Caching
Z Liu, Z Bai, Z Liu, X Li, C Kim, V Braverman, X Jin, I Stoica
17th USENIX Conference on File and Storage Technologies (FAST 19), 143-157, 2019
Cited by: 163; Year: 2019
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
Z Bai, Z Zhang, Y Zhu, X Jin
14th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2020
Cited by: 94; Year: 2020
Harmonia: Near-linear scalability for replicated storage with in-network conflict detection
H Zhu, Z Bai, J Li, E Michael, D Ports, I Stoica, X Jin
arXiv preprint arXiv:1904.08964, 2019
Cited by: 72; Year: 2019
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Z Jiang, H Lin, Y Zhong, Q Huang, Y Chen, Z Zhang, Y Peng, X Li, C Xie, ...
arXiv preprint arXiv:2402.15627, 2024
Cited by: 24; Year: 2024
Runtime recovery of web applications under zero-day ReDoS attacks
Z Bai, K Wang, H Zhu, Y Cao, X Jin
2021 IEEE Symposium on Security and Privacy (SP), 1575-1588, 2021
Cited by: 19; Year: 2021
Transparent GPU Sharing in Container Clouds for Deep Learning Workloads
B Wu, Z Zhang, Z Bai, X Liu, X Jin
20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023
Cited by: 17; Year: 2023
: Efficient Resource Disaggregation for Deep Learning Workloads
X Jin, Z Bai, Z Zhang, Y Zhu, Y Zhong, X Liu
IEEE/ACM Transactions on Networking, 2024
Cited by: 2; Year: 2024
Runtime scheduling and updating for deep learning applications
Z Bai
Johns Hopkins University, 2022
Year: 2022
Scaling Large Language Model Training to More Than 10,000 GPUs
Z Jiang, H Lin, Y Zhong, Q Huang, Y Chen, Z Zhang, Y Peng, X Li, C Xie, ...