Architecting the finite element method pipeline for the GPU

作者:

Highlights:

摘要

The finite element method (FEM) is a widely employed numerical technique for approximating the solution of partial differential equations (PDEs) in various science and engineering applications. Many of these applications benefit from fast execution of the FEM pipeline. One way to accelerate the FEM pipeline is by exploiting advances in modern computational hardware, such as the many-core streaming processors like the graphical processing unit (GPU). In this paper, we present the algorithms and data-structures necessary to move the entire FEM pipeline to the GPU. First we propose an efficient GPU-based algorithm to generate local element information and to assemble the global linear system associated with the FEM discretization of an elliptic PDE. To solve the corresponding linear system efficiently on the GPU, we implement a conjugate gradient method preconditioned with a geometry-informed algebraic multigrid (AMG) method preconditioner. We propose a new fine-grained parallelism strategy, a corresponding multigrid cycling stage and efficient data mapping to the many-core architecture of GPU. Comparison of our on-GPU assembly versus a traditional serial implementation on the CPU achieves up to an 87× speedup. Focusing on the linear system solver alone, we achieve a speedup of up to 51× versus use of a comparable state-of-the-art serial CPU linear system solver. Furthermore, the method compares favorably with other GPU-based, sparse, linear solvers.

论文关键词:92C05,Finite element method (FEM),Graphical processing units (GPUs),Algebraic multigrid (AMG)

论文评审过程:Received 9 August 2012, Revised 19 April 2013, Available online 6 September 2013.

论文官网地址:https://doi.org/10.1016/j.cam.2013.09.001