Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ... arXiv preprint arXiv:2201.11990, 2022 | 293* | 2022 |
SPLATT: Efficient and parallel sparse tensor-matrix multiplication S Smith, N Ravindran, ND Sidiropoulos, G Karypis 2015 IEEE International Parallel and Distributed Processing Symposium, 61-70, 2015 | 233 | 2015 |
BLOOM: A 176B-parameter open-access multilingual language model TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... arXiv preprint arXiv:2211.05100, 2022 | 211 | 2022 |
FROSTT: The Formidable Repository of Open Sparse Tensors and Tools S Smith, JW Choi, J Li, R Vuduc, J Park, X Liu, G Karypis http://frostt.io/, 2017 | 125 | 2017 |
Tensor-Matrix Products with a Compressed Sparse Tensor S Smith, G Karypis 5th Workshop on Irregular applications: Architectures and Algorithms (IA^3), 2015 | 122 | 2015 |
ZeRO-Infinity: Breaking the GPU memory wall for extreme scale deep learning S Rajbhandari, O Ruwase, J Rasley, S Smith, Y He Proceedings of the International Conference for High Performance Computing …, 2021 | 108 | 2021 |
A Medium-Grained Algorithm for Distributed Sparse Tensor Factorization S Smith, G Karypis Parallel and Distributed Processing Symposium (IPDPS), 2016 IEEE International, 2016 | 95* | 2016 |
Tensaurus: A versatile accelerator for mixed sparse-dense tensor computations N Srivastava, H Jin, S Smith, H Rong, D Albonesi, Z Zhang 2020 IEEE International Symposium on High Performance Computer Architecture …, 2020 | 71 | 2020 |
Bridging the gap between HPC and big data frameworks M Anderson, S Smith, N Sundaram, M Capotă, Z Zhao, S Dulloor, ... Proceedings of the VLDB Endowment 10 (8), 901-912, 2017 | 66 | 2017 |
Accelerating the Tucker decomposition with compressed sparse tensors S Smith, G Karypis Euro-Par 2017: Parallel Processing: 23rd International Conference on …, 2017 | 60 | 2017 |
Truss Decomposition on Shared-Memory Parallel Systems S Smith, X Liu, NK Ahmed, AS Tom, F Petrini, G Karypis IEEE High Performance Extreme Computing Conference (HPEC), 2017 | 56 | 2017 |
Streaming tensor factorization for infinite data sources S Smith, K Huang, ND Sidiropoulos, G Karypis Proceedings of the 2018 SIAM International Conference on Data Mining, 81-89, 2018 | 48 | 2018 |
Sparse tensor factorization on many-core processors with high-bandwidth memory S Smith, J Park, G Karypis 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017 | 42 | 2017 |
Big data frequent pattern mining DC Anastasiu, J Iverson, S Smith, G Karypis Frequent Pattern Mining, 225-259, 2014 | 42 | 2014 |
An Exploration of Optimization Algorithms for High Performance Tensor Completion S Smith, J Park, G Karypis Proceedings of the 2016 ACM/IEEE Conference on Supercomputing (SC '16), 2016 | 36 | 2016 |
Memory-efficient parallel computation of tensor and matrix products for big tensor decomposition N Ravindran, ND Sidiropoulos, S Smith, G Karypis 2014 48th Asilomar Conference on Signals, Systems and Computers, 581-585, 2014 | 35 | 2014 |
Exploring Optimizations on Shared-memory Platforms for Parallel Triangle Counting Algorithms AS Tom, N Sundaram, NK Ahmed, S Smith, S Eyerman, M Kodiyath, I Hur, ... IEEE High Performance Extreme Computing Conference (HPEC), 2017 | 31 | 2017 |
Constrained Tensor Factorization with Accelerated AO-ADMM S Smith, A Beri, G Karypis 46th International Conference on Parallel Processing (ICPP '17), 2017 | 30 | 2017 |
Blocking optimization techniques for sparse tensor computation J Choi, X Liu, S Smith, T Simon 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018 | 26 | 2018 |
SPLATT: The Surprisingly ParalleL spArse Tensor Toolkit S Smith, G Karypis http://cs.umn.edu/~splatt/, 2015 | 25 | 2015 |