Sunil Shukla
IBM Research
Verified email at us.ibm.com
Title
Cited by
Year
FPGA programming for the masses
DF Bacon, R Rabbah, S Shukla
Communications of the ACM 56 (4), 56-63, 2013
Cited by: 255; Year: 2013
A scalable multi-TeraOPS deep learning processor core for AI training and inference
B Fleischer, S Shukla, M Ziegler, J Silberman, J Oh, V Srinivasan, J Choi, ...
2018 IEEE symposium on VLSI circuits, 35-36, 2018
Cited by: 145; Year: 2018
Approximate computing: Challenges and opportunities
A Agrawal, J Choi, K Gopalakrishnan, S Gupta, R Nair, J Oh, DA Prener, ...
2016 IEEE International Conference on Rebooting Computing (ICRC), 1-8, 2016
Cited by: 114; Year: 2016
Single bit error correction implementation in CRC-16 on FPGA
S Shukla, NW Bergmann
Proceedings. 2004 IEEE International Conference on Field-Programmable …, 2004
Cited by: 84; Year: 2004
A compiler and runtime for heterogeneous computing
J Auerbach, DF Bacon, I Burcea, P Cheng, SJ Fink, R Rabbah, S Shukla
Proceedings of the 49th Annual Design Automation Conference, 271-276, 2012
Cited by: 80; Year: 2012
9.1 A 7nm 4-core AI chip with 25.6 TFLOPS hybrid FP8 training, 102.4 TOPS INT4 inference and workload-aware throttling
A Agrawal, SK Lee, J Silberman, M Ziegler, M Kang, S Venkataramani, ...
2021 IEEE International Solid-State Circuits Conference (ISSCC) 64, 144-146, 2021
Cited by: 69; Year: 2021
RaPiD: AI accelerator for ultra-low precision training and inference
S Venkataramani, V Srinivasan, W Wang, S Sen, J Zhang, A Agrawal, ...
2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021
Cited by: 64; Year: 2021
QUKU: a two-level reconfigurable architecture
S Shukla, NW Bergmann, J Becker
IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and …, 2006
Cited by: 60; Year: 2006
Efficient AI system design with cross-layer approximate computing
S Venkataramani, X Sun, N Wang, CY Chen, J Choi, M Kang, A Agarwal, ...
Proceedings of the IEEE 108 (12), 2232-2250, 2020
Cited by: 46; Year: 2020
A 3.0 TFLOPS 0.62 V scalable processor core for high compute utilization AI training and inference
J Oh, SK Lee, M Kang, M Ziegler, J Silberman, A Agrawal, ...
2020 IEEE Symposium on VLSI Circuits, 1-2, 2020
Cited by: 38; Year: 2020
Tightly coupled processor arrays using coarse grained reconfigurable architecture with iteration level commits
CY Chen, K Gopalakrishnan, J Oh, SK Shukla, V Srinivasan
US Patent 10,120,685, 2018
Cited by: 36; Year: 2018
A scalable multi-TeraOPS core for AI training and inference
S Shukla, B Fleischer, M Ziegler, J Silberman, J Oh, V Srinivasan, J Choi, ...
IEEE Solid-State Circuits Letters 1 (12), 217-220, 2018
Cited by: 33; Year: 2018
Tightly coupled processor arrays using coarse grained reconfigurable architecture with iteration level commits
CY Chen, K Gopalakrishnan, J Oh, LM Saltzman, SK Shukla, V Srinivasan
US Patent 10,528,356, 2020
Cited by: 31; Year: 2020
QUKU: a dual-layer reconfigurable architecture
NW Bergmann, SK Shukla, J Becker
ACM Transactions on Embedded Computing Systems (TECS) 12 (1s), 1-26, 2013
Cited by: 30; Year: 2013
QUKU: A FPGA based flexible coarse grain architecture design paradigm using process networks
S Shukla, NW Bergmann, J Becker
2007 IEEE International Parallel and Distributed Processing Symposium, 1-7, 2007
Cited by: 28; Year: 2007
QUKU: A fast run time reconfigurable platform for image edge detection
S Shukla, NW Bergmann, J Becker
International Workshop on Applied Reconfigurable Computing, 93-98, 2006
Cited by: 18; Year: 2006
QUKU: A coarse grained paradigm for FPGAs
S Shukla, NW Bergmann, J Becker
Schloss-Dagstuhl-Leibniz Zentrum für Informatik, 2006
Cited by: 18; Year: 2006
A 7-nm four-core mixed-precision AI chip with 26.2-TFLOPS hybrid-FP8 training, 104.9-TOPS INT4 inference, and workload-aware throttling
SK Lee, A Agrawal, J Silberman, M Ziegler, M Kang, S Venkataramani, ...
IEEE Journal of Solid-State Circuits 57 (1), 182-197, 2021
Cited by: 15; Year: 2021
And then there were none: A stall-free real-time garbage collector for reconfigurable hardware
DF Bacon, P Cheng, S Shukla
ACM SIGPLAN Notices 47 (6), 23-34, 2012
Cited by: 15; Year: 2012
Cycle-accurate replay and debugging of running FPGA systems
D Foisy, SK Shukla
US Patent 9,217,774, 2015
Cited by: 11; Year: 2015