PipeDream: Generalized pipeline parallelism for DNN training D Narayanan, A Harlap, A Phanishayee, V Seshadri, NR Devanur, ... Proceedings of the 27th ACM symposium on operating systems principles, 1-15, 2019 | 836 | 2019 |
Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology V Seshadri, D Lee, T Mullins, H Hassan, A Boroumand, J Kim, MA Kozuch, ... Proceedings of the 50th Annual IEEE/ACM International Symposium on …, 2017 | 632 | 2017 |
RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization V Seshadri, Y Kim, C Fallin, D Lee, R Ausavarungnirun, G Pekhimenko, ... Proceedings of the 46th Annual IEEE/ACM International Symposium on …, 2013 | 497 | 2013 |
Base-delta-immediate compression: Practical data compression for on-chip caches G Pekhimenko, V Seshadri, O Mutlu, PB Gibbons, MA Kozuch, TC Mowry Proceedings of the 21st international conference on Parallel architectures …, 2012 | 497 | 2012 |
A case for exploiting subarray-level parallelism (SALP) in DRAM Y Kim, V Seshadri, D Lee, J Liu, O Mutlu ACM SIGARCH Computer Architecture News 40 (3), 368-379, 2012 | 474 | 2012 |
Tiered-latency DRAM: A low latency and low cost DRAM architecture D Lee, Y Kim, V Seshadri, J Liu, L Subramanian, O Mutlu 2013 IEEE 19th International Symposium on High Performance Computer …, 2013 | 356 | 2013 |
PipeDream: Fast and efficient pipeline parallel DNN training A Harlap, D Narayanan, A Phanishayee, V Seshadri, N Devanur, ... arXiv preprint arXiv:1806.03377, 2018 | 265 | 2018 |
Adaptive-latency DRAM: Optimizing DRAM timing for the common-case D Lee, Y Kim, G Pekhimenko, S Khan, V Seshadri, K Chang, O Mutlu 2015 IEEE 21st International Symposium on High Performance Computer …, 2015 | 265 | 2015 |
Fast bulk bitwise AND and OR in DRAM V Seshadri, K Hsieh, A Boroumand, D Lee, MA Kozuch, O Mutlu, PB Gibbons, ... IEEE Computer Architecture Letters 14 (2), 127-131, 2015 | 248 | 2015 |
The application slowdown model: Quantifying and controlling the impact of inter-application interference at shared caches and main memory L Subramanian, V Seshadri, A Ghosh, S Khan, O Mutlu Proceedings of the 48th International Symposium on Microarchitecture, 62-75, 2015 | 226 | 2015 |
MISE: Providing performance predictability and improving fairness in shared main memory systems L Subramanian, V Seshadri, Y Kim, B Jaiyen, O Mutlu 2013 IEEE 19th International Symposium on High Performance Computer …, 2013 | 224 | 2013 |
Linearly compressed pages: A low-complexity, low-latency main memory compression framework G Pekhimenko, V Seshadri, Y Kim, H Xin, O Mutlu, PB Gibbons, ... Proceedings of the 46th Annual IEEE/ACM International Symposium on …, 2013 | 194 | 2013 |
ChargeCache: Reducing DRAM latency by exploiting row access locality H Hassan, G Pekhimenko, N Vijaykumar, V Seshadri, D Lee, O Ergin, ... 2016 IEEE International Symposium on High Performance Computer Architecture …, 2016 | 182 | 2016 |
Gather-scatter DRAM: In-DRAM address translation to improve the spatial locality of non-unit strided accesses V Seshadri, T Mullins, A Boroumand, O Mutlu, PB Gibbons, MA Kozuch, ... Proceedings of the 48th International Symposium on Microarchitecture, 267-280, 2015 | 168 | 2015 |
The evicted-address filter: A unified mechanism to address both cache pollution and thrashing V Seshadri, O Mutlu, MA Kozuch, TC Mowry Proceedings of the 21st international conference on Parallel architectures …, 2012 | 166 | 2012 |
Design-induced latency variation in modern DRAM chips: Characterization, analysis, and latency reduction mechanisms D Lee, S Khan, L Subramanian, S Ghose, R Ausavarungnirun, ... Proceedings of the ACM on Measurement and Analysis of Computing Systems 1 (1 …, 2017 | 154 | 2017 |
The blacklisting memory scheduler: Achieving high performance and fairness at low cost L Subramanian, D Lee, V Seshadri, H Rastogi, O Mutlu 2014 IEEE 32nd International Conference on Computer Design (ICCD), 8-15, 2014 | 132 | 2014 |
BLISS: Balancing performance, fairness and complexity in memory access scheduling L Subramanian, D Lee, V Seshadri, H Rastogi, O Mutlu IEEE Transactions on Parallel and Distributed Systems 27 (10), 3071-3087, 2016 | 107 | 2016 |
In-DRAM bulk bitwise execution engine V Seshadri, O Mutlu arXiv preprint arXiv:1905.09822, 2019 | 95 | 2019 |
Compiling KB-sized machine learning models to tiny IoT devices S Gopinath, N Ghanathe, V Seshadri, R Sharma Proceedings of the 40th ACM SIGPLAN Conference on Programming Language …, 2019 | 90 | 2019 |