[Pw_forum] First PWscf GPU-enabled beta-release

Vit vitruss at gmail.com
Thu May 5 18:24:53 CEST 2011

Dear Ivan,
Can you please provide any benchmarks comparing hybrid CPU/GPU with 
pure CPU computation? 
With best regards,

Ivan Girotto <ivan.girotto at ichec.ie>
Thursday 05 May 2011
> Dear QE users & developers,
> We are happy to announce that the first beta GPU-enabled release of
> Quantum ESPRESSO (QE) has been committed today to the official repository.
> You can download the new version of the code using the following command:
> $ svn checkout svn://scm.qe-forge.org/scmrepos/svn/q-e/branches/espresso-PRACE
> The Irish Centre for High-End Computing (ICHEC, http://www.ichec.ie)
> has been mainly responsible for extending the QE suite to accelerate
> calculations on NVIDIA GPUs. The porting activity has
> been supported within the PRACE 1st Implementation Phase project. It is
> currently carried out through the Sub-task "Accelerator", led by ICHEC,
> within the Work-Package "Programming Techniques for High-Performance
> Applications" in collaboration with CINECA.
> The porting activity mainly concerns the PWscf package, but ICHEC and
> the Irish QE user community are interested in exploring any other
> initiatives proposed by QE users or developers who want to port other
> codes of the QE suite to GPGPU architectures.
> We have successfully accelerated the linear algebra part of the QE suite
> using a library called phiGEMM, some explicit computational kernels
> (newd, addusdense, vloc_psi) and the 3D FFT for the single CPU/GPU
> version. Both the linear algebra (matrix multiplication) and the
> accelerated FFT make use of CUDA libraries. The porting is mainly
> based on wrappers that permit the use of libraries for accelerators. The
> distributed 3D FFT version is still in progress, since this porting
> requires significant changes to the current structure of the code and
> its data distribution. When run in parallel on multiple GPUs, the code
> still uses the original 3D FFT implementation.
> The phiGEMM library is distributed as an independent open-source
> external package together with the Quantum ESPRESSO suite. It aims to
> perform matrix multiplication ([SDZ]GEMM) taking advantage of the
> underlying BLAS kernel functions on both CPU and NVIDIA CUDA-based GPU,
> mixing and overlapping computation between the host (CPU) and the
> accelerator (GPU). Any code that makes intensive use of GEMM can
> benefit significantly from linking this library when running on a
> hybrid CPU/GPU system.
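As a rough illustration of the host/accelerator split described above
(not phiGEMM's actual code or API), here is a minimal pure-Python sketch.
It assumes the product is divided by rows of the first matrix; the names
matmul, split_gemm and gpu_fraction are hypothetical, and two threads only
stand in for the GPU and CPU GEMM calls.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain triple-loop product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def split_gemm(a, b, gpu_fraction=0.75):
    """Compute a @ b by splitting the rows of `a` between two workers.

    The first block stands in for the share sent to the accelerator, the
    second for the share kept on the host. `gpu_fraction` is a hypothetical
    tuning knob, not a documented phiGEMM parameter.
    """
    cut = round(len(a) * gpu_fraction)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Both halves are submitted before either result is awaited,
        # which models the overlap of GPU and CPU work.
        gpu_part = pool.submit(matmul, a[:cut], b)  # stand-in for a GPU GEMM
        cpu_part = pool.submit(matmul, a[cut:], b)  # stand-in for a host BLAS GEMM
        # Row blocks of the result are simply stacked back together.
        return gpu_part.result() + cpu_part.result()
```

In a real hybrid run the split ratio would presumably be chosen so that
both sides finish at about the same time; here it is just a parameter.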
> Although the 3D FFT is accelerated only for a single CPU process (not
> when using MPI), other parts can already take advantage of a
> distributed parallel hybrid system. PWscf can therefore potentially
> combine distributed message-passing parallelization (MPI),
> multi-threading (OpenMP) and accelerators (NVIDIA GPUs), and achieve
> good performance gains on the latest generation of NVIDIA GPUs (high
> double-precision performance is needed). This porting activity is still
> in progress, especially the parallel 3D FFT component, which represents
> a bottleneck for large calculations. We have been testing this beta
> release with some small/medium benchmarks from the official DEISA
> benchmark suite on several kinds of GPU hardware (Tesla and Fermi
> architectures). Special thanks go to both E4 Computer Engineering and
> CEA for providing access to hybrid GPU systems with configurations
> different from those available at ICHEC.
> We look forward with interest to receiving suggestions for
> improvement, feedback or requests for collaboration from anyone
> interested in trying and validating the PWscf CUDA version on different
> platforms with different scientific cases. For additional information
> please contact qe-gpu at ichec.ie or reply to this mail. A dedicated
> forum, q-e-gpgpu at qe-forge.org
> <http://qe-forge.org/mail/?group_id=10>, will be available shortly.
> Please use this list for bug reports and any other issues related to
> the use of the PWscf GPU version.
> Although compilation of the GPU implementation is fairly
> straightforward, we kindly suggest that users carefully read the
> README.GPU that is included. The intrinsic characteristics of hybrid
> multi- and many-core systems require careful consideration to best
> exploit the available computing power.
> Any and all suggestions are welcome and will be very much appreciated.
> Ivan Girotto & Filippo Spiga
> ---
> ICHEC GPU developer team
> The Tower - 7th floor
> Trinity Technology & Enterprise Campus
> Grand Canal Quay - Dublin 2 - Ireland
> +353-1-5241608 (ph) / +353-1-7645845 (fax)
