Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA G Bosilca, A Bouteiller, A Danalis, M Faverge, A Haidar, T Herault, ... 2011 IEEE International Symposium on Parallel and Distributed Processing …, 2011 | 148* | 2011 |

Seismic wave modeling for seismic imaging J Virieux, S Operto, H Ben-Hadj-Ali, R Brossier, V Etienne, F Sourbier, ... The Leading Edge 28 (5), 538-544, 2009 | 80 | 2009 |

Accelerating numerical dense linear algebra calculations with GPUs J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek, S Tomov, ... Numerical computations with GPUs, 3-28, 2014 | 64 | 2014 |

Parallel reduction to condensed forms for symmetric eigenvalue problems using aggregated fine-grained and memory-aware kernels A Haidar, H Ltaief, J Dongarra Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 64 | 2011 |

Performance, design, and autotuning of batched GEMM for GPUs A Abdelfattah, A Haidar, S Tomov, J Dongarra International Conference on High Performance Computing, 21-38, 2016 | 54 | 2016 |

Batched matrix computations on hardware accelerators based on GPUs A Haidar, T Dong, P Luszczek, S Tomov, J Dongarra The International Journal of High Performance Computing Applications 29 (2 …, 2015 | 51 | 2015 |

Image-based date fruit classification A Haidar, H Dong, N Mavridis 2012 IV International Congress on Ultra Modern Telecommunications and …, 2012 | 44 | 2012 |

Car parking vacancy detection and its application in 24-hour statistical analysis J Jermsurawong, MU Ahsan, A Haidar, H Dong, N Mavridis 2012 10th International Conference on Frontiers of Information Technology, 84-90, 2012 | 41 | 2012 |

Parallel scalability study of hybrid preconditioners in three dimensions L Giraud, A Haidar, LT Watson Parallel Computing 34 (6-8), 363-379, 2008 | 41 | 2008 |

Unified development for mixed multi-GPU and multi-coprocessor environments using a lightweight runtime environment A Haidar, C Cao, A YarKhan, P Luszczek, S Tomov, K Kabir, J Dongarra 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 40* | 2014 |

LU factorization of small matrices: accelerating batched DGETRF on the GPU T Dong, A Haidar, P Luszczek, JA Harris, S Tomov, J Dongarra 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 …, 2014 | 39 | 2014 |

An improved parallel singular value algorithm and its implementation for multicore hardware A Haidar, J Kurzak, P Luszczek Proceedings of the International Conference on High Performance Computing …, 2013 | 37 | 2013 |

Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures A Haidar, H Ltaief, A YarKhan, J Dongarra Concurrency and Computation: Practice and Experience 24 (3), 305-321, 2012 | 37 | 2012 |

A framework for batched and GPU-resident factorization algorithms applied to block Householder transformations A Haidar, TT Dong, S Tomov, P Luszczek, J Dongarra International Conference on High Performance Computing, 31-47, 2015 | 36 | 2015 |

Sparse approximations of the Schur complement for parallel algebraic hybrid linear solvers in 3D L Giraud, A Haidar, Y Saad | 36 | 2010 |

High-performance tensor contractions for GPUs A Abdelfattah, M Baboulin, V Dobrev, J Dongarra, C Earl, J Falcou, ... Procedia Computer Science 80, 108-118, 2016 | 34 | 2016 |

High-performance matrix-matrix multiplications of very small matrices I Masliah, A Abdelfattah, A Haidar, S Tomov, M Baboulin, J Falcou, ... European Conference on Parallel Processing, 659-671, 2016 | 31 | 2016 |

HPC programming on Intel Many-Integrated-Core hardware with MAGMA port to Xeon Phi J Dongarra, M Gates, A Haidar, Y Jia, K Kabir, P Luszczek, S Tomov Scientific Programming 2015, 9, 2015 | 31 | 2015 |

A fast batched Cholesky factorization on a GPU T Dong, A Haidar, S Tomov, J Dongarra 2014 43rd International Conference on Parallel Processing, 432-440, 2014 | 30 | 2014 |

Multithreading in the PLASMA Library J Kurzak, P Luszczek, A YarKhan, M Faverge, J Langou, H Bouwmeester, ... Multicore Computing: Algorithms, Architectures, and Applications, 119, 2013 | 29 | 2013 |