An efficient hybrid tridiagonal divide-and-conquer algorithm on distributed memory architectures
Abstract
In this paper, we propose an efficient divide-and-conquer (DC) algorithm for symmetric tridiagonal matrices based on ScaLAPACK and hierarchically semiseparable (HSS) matrices, an important class of rank-structured matrices. The most computationally intensive part of the DC algorithm is computing the eigenvectors via matrix–matrix multiplications (MMM). In our parallel hybrid DC (PHDC) algorithm, MMM is accelerated by HSS matrix techniques when the intermediate matrix is large. All HSS computations are performed with the STRUMPACK package. PHDC has been tested on many different matrices. Compared with the DC implementation in MKL, PHDC can be faster for some matrices with few deflations when using hundreds of processes. However, the gains decrease as the number of processes increases. The comparisons of PHDC with ELPA (the Eigenvalue soLvers for Petascale Applications library) are similar. PHDC is usually slower than MKL and ELPA when using 300 or more processes on the Tianhe-2 supercomputer.
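To illustrate why the eigenvector update is dominated by a matrix–matrix multiplication, the following is a minimal single-process sketch of one Cuppen-style divide-and-conquer merge step in NumPy. It is not the PHDC implementation: the function name `dc_merge_step` is hypothetical, `numpy.linalg.eigh` stands in for both the recursive subproblem solves and the secular-equation solver, deflation is omitted, and the final product is an ordinary dense multiplication where PHDC would use HSS techniques for large intermediate matrices.

```python
import numpy as np


def dc_merge_step(d, e, k):
    """One divide-and-conquer split of a symmetric tridiagonal matrix.

    d: diagonal (length n); e: off-diagonal (length n - 1), where e[i]
    couples rows i and i + 1; k: split index with 1 <= k < n.
    Returns (eigenvalues, eigenvectors) of the full matrix.
    """
    n = d.size
    beta = e[k - 1]  # coupling entry removed by the rank-one tear

    # T = blockdiag(T1, T2) + beta * u u^T, with u = e_{k-1} + e_k.
    d1 = d[:k].copy()
    d1[-1] -= beta
    d2 = d[k:].copy()
    d2[0] -= beta
    T1 = np.diag(d1) + np.diag(e[:k - 1], 1) + np.diag(e[:k - 1], -1)
    T2 = np.diag(d2) + np.diag(e[k:], 1) + np.diag(e[k:], -1)

    # Stand-in for the recursive solves of the two subproblems.
    w1, Q1 = np.linalg.eigh(T1)
    w2, Q2 = np.linalg.eigh(T2)

    # z = Q^T u: last row of Q1 followed by first row of Q2.
    D = np.concatenate([w1, w2])
    z = np.concatenate([Q1[-1, :], Q2[0, :]])

    # Stand-in for the secular-equation solver on D + beta * z z^T.
    lam, U = np.linalg.eigh(np.diag(D) + beta * np.outer(z, z))

    # Eigenvector update: the matrix-matrix multiplication that dominates
    # the cost of the DC algorithm.
    Q = np.zeros((n, n))
    Q[:k, :k] = Q1
    Q[k:, k:] = Q2
    V = Q @ U
    return lam, V
```

In this sketch the product `Q @ U` costs O(n^3) operations at the top level of the recursion, which is the step the abstract identifies as the most computationally intensive.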
Keywords: 65F15, 68W10, ScaLAPACK, Divide-and-conquer, HSS matrix, Distributed parallel algorithm
Review history: Received 9 December 2016, Revised 29 November 2017, Available online 6 June 2018, Version of Record 18 June 2018.
Official paper URL: https://doi.org/10.1016/j.cam.2018.05.051