<div dir="ltr"><div>Dear all,</div><div>I am trying to compile version 6.7 of the code with the PGI 2020 compilers (NVIDIA HPC SDK 20.11).</div><div>I followed these steps:</div><div><br></div><div><b>1) The NVIDIA driver (NVIDIA-Linux-x86_64-450.80.02.rpm) is installed.</b></div><div><b>The output of nvidia-smi:</b></div><div><br></div><div>Wed Dec 16 09:07:11 2020<br>+-----------------------------------------------------------------------------+<br>| NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.0     |<br>|-------------------------------+----------------------+----------------------+<br>| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |<br>| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |<br>|                               |                      |               MIG M. |<br>|===============================+======================+======================|<br>|   0  TITAN V             Off  | 00000000:06:00.0 Off |                  N/A |<br>| 27%   37C    P0    32W / 250W |      0MiB / 12066MiB |      0%      Default |<br>|                               |                      |                  N/A |<br>+-------------------------------+----------------------+----------------------+<br>|   1  TITAN V             Off  | 00000000:07:00.0 Off |                  N/A |<br>| 25%   37C    P0    35W / 250W |      0MiB / 12066MiB |      0%      Default |<br>|                               |                      |                  N/A |<br>+-------------------------------+----------------------+----------------------+<br><br>+-----------------------------------------------------------------------------+<br>| Processes:                                                                  |<br>|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |<br>|        ID   ID                                                   Usage      
|<br>|=============================================================================|<br>|  No running processes found                                                 |<br>+-----------------------------------------------------------------------------+<br></div><div><br></div><div><b>The output of pgaccelinfo:</b></div><div><br></div><div>CUDA Driver Version:           11000<br>NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  450.80.02  Wed Sep 23 01:13:39 UTC 2020<br><br>Device Number:                 0<br>Device Name:                   TITAN V<br>Device Revision Number:        7.0<br>Global Memory Size:            12652838912<br>Number of Multiprocessors:     80<br>Concurrent Copy and Execution: Yes<br>Total Constant Memory:         65536<br>Total Shared Memory per Block: 49152<br>Registers per Block:           65536<br>Warp Size:                     32<br>Maximum Threads per Block:     1024<br>Maximum Block Dimensions:      1024, 1024, 64<br>Maximum Grid Dimensions:       2147483647 x 65535 x 65535<br>Maximum Memory Pitch:          2147483647B<br>Texture Alignment:             512B<br>Clock Rate:                    1455 MHz<br>Execution Timeout:             No<br>Integrated Device:             No<br>Can Map Host Memory:           Yes<br>Compute Mode:                  default<br>Concurrent Kernels:            Yes<br>ECC Enabled:                   No<br>Memory Clock Rate:             850 MHz<br>Memory Bus Width:              3072 bits<br>L2 Cache Size:                 4718592 bytes<br>Max Threads Per SMP:           2048<br>Async Engines:                 7<br>Unified Addressing:            Yes<br>Managed Memory:                Yes<br>Concurrent Managed Memory:     Yes<br>Preemption Supported:          Yes<br>Cooperative Launch:            Yes<br>  Multi-Device:                Yes<br>Default Target:                cc70<br><br>Device Number:                 1<br>Device Name:                   TITAN V<br>Device Revision Number:        7.0<br>Global 
Memory Size:            12652838912<br>Number of Multiprocessors:     80<br>Concurrent Copy and Execution: Yes<br>Total Constant Memory:         65536<br>Total Shared Memory per Block: 49152<br>Registers per Block:           65536<br>Warp Size:                     32<br>Maximum Threads per Block:     1024<br>Maximum Block Dimensions:      1024, 1024, 64<br>Maximum Grid Dimensions:       2147483647 x 65535 x 65535<br>Maximum Memory Pitch:          2147483647B<br>Texture Alignment:             512B<br>Clock Rate:                    1455 MHz<br>Execution Timeout:             No<br>Integrated Device:             No<br>Can Map Host Memory:           Yes<br>Compute Mode:                  default<br>Concurrent Kernels:            Yes<br>ECC Enabled:                   No<br>Memory Clock Rate:             850 MHz<br>Memory Bus Width:              3072 bits<br>L2 Cache Size:                 4718592 bytes<br>Max Threads Per SMP:           2048<br>Async Engines:                 7<br>Unified Addressing:            Yes<br>Managed Memory:                Yes<br>Concurrent Managed Memory:     Yes<br>Preemption Supported:          Yes<br>Cooperative Launch:            Yes<br>  Multi-Device:                Yes<br>Default Target:                cc70</div><div><br></div><div><b>2) PGI compiler is installed:</b></div><div><b>yum install nvhpc-20-11-20.11-1.x86_64.rpm nvhpc-2020-20.11-1.x86_64.rpm<br></b></div><div><b>PATHs that are set in ~/.bashrc file:<br></b></div><div><b><br></b></div><div>export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/bin:$PATH<br>export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/include:$PATH<br>export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/lib64:$LD_LIBRARY_PATH<br>export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/extras/CUPTI/lib64:$LD_LIBRARY_PATH<br>export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/11.1/lib64/stubs:$LD_LIBRARY_PATH</div><div>NVARCH=`uname -s`_`uname -m`; 
export NVARCH<br>NVCOMPILERS=/opt/nvidia/hpc_sdk; export NVCOMPILERS<br>MANPATH=$MANPATH:$NVCOMPILERS/$NVARCH/20.11/compilers/man; export MANPATH<br>PATH=$NVCOMPILERS/$NVARCH/20.11/compilers/bin:$PATH; export PATH<br>PATH=$NVCOMPILERS/$NVARCH/20.11/compilers/include:$PATH; export PATH<br>LD_LIBRARY_PATH=$NVCOMPILERS/$NVARCH/20.11/compilers/lib:$PATH; export LD_LIBRARY_PATH</div><div>export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/comm_libs/mpi/bin:$PATH<br>export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/comm_libs/mpi/include:$PATH<br>export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/comm_libs/mpi/lib:$LD_LIBRARY_PATH<br>export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/math_libs/11.1/lib64:$LD_LIBRARY_PATH<br>export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/math_libs/11.1/lib64/stubs:$LD_LIBRARY_PATH<b><br></b></div><div><b><br></b></div><div><b>3) Configuring the code with:</b></div><div><b>./configure FC=pgf90 F90=pgf90 F77=pgf90 CC=pgcc MPIF90=mpif90 --with-cuda=/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda --with-cuda-runtime=11.1 --with-cuda-cc=70 --enable-openmp --with-scalapack=no</b></div><div><br></div><div>checking build system type... x86_64-pc-linux-gnu<br>checking ARCH... x86_64<br>checking setting AR... ... ar<br>checking setting ARFLAGS... ... ruv<br>checking whether the Fortran compiler works... yes<br>checking for Fortran compiler default output file name... a.out<br>checking for suffix of executables...<br>checking whether we are cross compiling... no<br>checking for suffix of object files... o<br>checking whether we are using the GNU Fortran compiler... no<br>checking whether pgf90 accepts -g... yes<br>configure: WARNING: F90 value is set to be consistent with value of MPIF90<br>checking for mpif90... mpif90<br>checking whether we are using the GNU Fortran compiler... no<br>checking whether mpif90 accepts -g... yes<br>checking version of mpif90... 
nvfortran 20.11-0<br>checking for Fortran flag to compile .f90 files... none<br>setting F90... nvfortran<br>setting MPIF90... mpif90<br>checking whether we are using the GNU C compiler... yes<br>checking whether pgcc accepts -g... yes<br>checking for pgcc option to accept ISO C89... none needed<br>setting CC... pgcc<br>setting CFLAGS... -fast -Mpreprocess<br>using F90... nvfortran<br>setting FFLAGS... -O1<br>setting F90FLAGS... $(FFLAGS)<br>setting FFLAGS_NOOPT... -O0<br>setting CPP... cpp<br>setting CPPFLAGS... -P -traditional -Uvector<br>setting LD... mpif90<br>setting LDFLAGS...<br>checking for Fortran flag to compile .f90 files... (cached) none<br>checking whether Fortran compiler accepts -Mcuda=cuda11.1... yes<br>checking for nvcc... /opt/nvidia/hpc_sdk/Linux_x86_64/20.11/compilers/bin/nvcc<br>checking whether nvcc works... no<br>configure: WARNING: CUDA compiler has problems.<br>checking for cuInit in -lcuda... no<br>configure: error: in `/codes/qe_6.7_GPU/q-e-gpu-qe-gpu-6.7':<br>configure: error: Couldn't find libcuda<br>See `config.log' for more details<b><br></b></div><div><b><br></b></div><div><b>Any help will be greatly appreciated.</b></div><div><b><br></b></div><div><b>P.S.</b></div><div><b>When I run nvcc in a terminal, the following error appears:</b></div><div>$ which nvcc</div><div>/opt/nvidia/hpc_sdk/Linux_x86_64/20.11/compilers/bin/nvcc</div><div>$ nvcc</div><div>nvcc-Error-CUDA version 10.2 was not installed with this HPC SDK: /opt/nvidia/hpc_sdk/Linux_x86_64/20.11/cuda/10.2/bin</div><div><br></div><div>
<div><b>Best,</b></div><div><b>Mohammad Moaddeli</b></div><div><b>Shiraz University</b></div>

</div></div>