[QE-users] Help for compilation with Nvidia HPC SDK

Mauro Francesco Sgroi maurofrancesco.sgroi at unito.it
Fri Aug 2 15:33:52 CEST 2024


Dear Fabrizio,
thanks a lot for the help.
Now my compilation works.
Best regards,
Mauro Sgroi.

_______________________

Dr. Mauro Francesco Sgroi
Department of Chemistry, University of Turin
Via Quarello 15a, I-10135 TORINO (Italy)
Tel. +39 011-670-8372 / +39 011-670-7364
e-mail: maurofrancesco.sgroi at unito.it
Web: www.met.unito.it | www.chimica.unito.it
Orcid: https://orcid.org/0000-0002-0914-4217
Webex: https://unito.webex.com/webappng/sites/unito/dashboard/pmr/maurofrancesco.sgroi



On Fri, 2 Aug 2024 at 15:13, Fabrizio Ferrari Ruffino <faferrar at sissa.it> wrote:

> The GPU executable can be launched in the same way as the CPU one, but
> keep the following in mind:
>
>    - the number of MPI processes per node must match the number of GPUs
>    (2 MPI processes per node in your case). In principle you can try to run
>    more MPI processes per GPU, but it is not recommended;
>    - you can enable OpenMP together with the GPU build (add --enable-openmp
>    to ./configure, as in the sketch below) to exploit CPU threading in the
>    few places where GPU porting is not present (no more than 8 threads per
>    node; it generally does not make much difference, though).
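>
> For reference, a minimal configure sketch with OpenMP enabled on top of the
> GPU build; the CUDA path, compute capability and runtime version are simply
> copied from your configure line, so adjust them if your installation differs:
>
> # GPU build with OpenMP threading enabled (values taken from your message)
> ./configure --enable-openmp \
>     --with-cuda=/opt/nvidia/hpc_sdk/Linux_x86_64/2024/cuda/12.5 \
>     --with-cuda-cc=75 --with-cuda-runtime=12.5 --with-cuda-mpi=yes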
>
>
> I don't know which scheduler is in use on your system; here is an example
> of a Slurm batch job launching on 2 nodes with 2 GPUs each:
>
> ------------------------------------------------------------------------------------------------------------
> #!/bin/bash
> #SBATCH --nodes=2
> #SBATCH --ntasks-per-node=2
> #SBATCH --cpus-per-task=1
> #SBATCH --gres=gpu:2
> #SBATCH --time=00:20:00
>
> module purge
> module load hpcsdk/24.3
>
> export OMP_NUM_THREADS=1
>
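> # 4 MPI tasks in total = 2 nodes x 2 tasks per node, i.e. one task per GPU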
> mpirun -np 4  /home/q-e/bin/pw.x  -nk 1 -nb 1 -input scf.in > scf.out
>
> ---------------------------------------------------------------------------------------------------------------
>
> Hope it helps
> Cheers,
>
> Fabrizio
>
> ------------------------------
> From: Mauro Francesco Sgroi <maurofrancesco.sgroi at unito.it>
> Sent: Friday, August 2, 2024 2:35 PM
> To: Fabrizio Ferrari Ruffino <faferrar at sissa.it>
> Cc: Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org>
> Subject: Re: [QE-users] Help for compilation with Nvidia HPC SDK
>
> Dear Fabrizio,
> thanks a lot for the explanation.
> I was unsure how to proceed and was worried about not getting the proper
> performance on the GPU.
>
> May I ask for help on how to run the code? Where can I find
> instructions on how to launch the executable?
>
> For example, how can I control the number of GPUs used and the parallel
> processes?
>
> I have 2 GPUs for each node.
>
> Thanks a lot and best regards,
> Mauro Sgroi.
>
>
>
> On Fri, 2 Aug 2024 at 14:11, Fabrizio Ferrari Ruffino <faferrar at sissa.it> wrote:
>
> Hi,
> there are a few minor FFTXlib calls in QE which still run on the CPU,
> therefore it is better to have a CPU FFT backend enabled too. Whether
> you use the internal one or FFTW3 should not make much difference, since all
> the main work runs on the GPU (and therefore calls cuFFT).
> In a CPU run the FFTW3 backend is faster than the internal one, but, as I
> said, in a GPU run it should be quite irrelevant.
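>
> If you do want to try the FFTW3 backend, something along these lines should
> work at configure time. This is just a sketch: the FFTW3 path below is a
> placeholder for wherever the library lives on your machine, and FFT_LIBS is
> only needed if configure does not find it on its own:
>
> # hypothetical example: point configure at an external FFTW3 installation
> export FFT_LIBS="-L/usr/lib/x86_64-linux-gnu -lfftw3"
> ./configure --enable-openmp \
>     --with-cuda=/opt/nvidia/hpc_sdk/Linux_x86_64/2024/cuda/12.5 \
>     --with-cuda-cc=75 --with-cuda-runtime=12.5 --with-cuda-mpi=yes
>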
> Cheers,
>
> Fabrizio
> CNR IOM
> ------------------------------
> From: users <users-bounces at lists.quantum-espresso.org> on behalf of
> Mauro Francesco Sgroi via users <users at lists.quantum-espresso.org>
> Sent: Friday, August 2, 2024 12:13 PM
> To: Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org>
> Subject: [QE-users] Help for compilation with Nvidia HPC SDK
>
> Dear all,
> I am trying to compile version 7.3.1 of Quantum ESPRESSO using the
> latest Nvidia HPC SDK (24.7) on Ubuntu 24.04.
>
> I am configuring as follows:
>
> export BLAS_LIBS='-L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/math_libs/lib64 -lcublas -lcublasLt -L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/compilers/lib -lblas -L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/cuda/lib64 -lcudart'
>
> export LAPACK_LIBS='-L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/math_libs/lib64 -lcusolver -lcurand -lcublas -lcublasLt -lcusparse -L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/compilers/lib -llapack -L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/cuda/lib64 -lcudart'
>
> export SCALAPACK_LIBS='-L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/comm_libs/12.5/openmpi4/openmpi-4.1.5/lib -lscalapack -L/opt/nvidia/hpc_sdk/Linux_x86_64/2024/comm_libs/12.5/openmpi4/latest/lib -lmpi -lopen-pal'
>
> ./configure --with-cuda=/opt/nvidia/hpc_sdk/Linux_x86_64/2024/cuda/12.5 \
>     --with-cuda-cc=75 --with-cuda-runtime=12.5 --with-cuda-mpi=yes
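>
> (For reference, I am checking which FFT backend configure picks by looking at
> the DFLAGS line in make.inc; if I read the flags correctly, -D__FFTW means
> the internal FFTW, -D__FFTW3 the external FFTW3 and -D__CUDA the GPU port.)
>
> # quick check of the preprocessor flags selected by configure
> grep DFLAGS make.inc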
>
> With this configuration, the internal FFTW library is selected. Should I use
> the FFTW3 library together with cuFFT?
>
> Can the two libraries work together? Is it normal that the internal FFTW
> library is used, or should cuFFT alone be sufficient?
>
> Or is it better to use the cuFFTW library supplied by NVIDIA?
>
> Can I have some guidance on these points?
>
> Thanks a lot and best regards,
> Mauro Sgroi.
>