[QE-users] GPU enabled QE not running

Pietro Bonfa' pietro.bonfa at unipr.it
Thu Dec 31 16:00:39 CET 2020


Dear Rahul,

that seems indeed to be the case and I can only guess that the 
environment seen by QE is different.

It's easy to check: can you compile and run this trivial program?

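program ma
  implicit none
  integer, device :: a(1000)   ! array resident in GPU memory
  integer :: i
  a = 1                        ! initialize the device array from host code
  ! the directive below turns the following loop into a GPU kernel
  !$cuf kernel do
  do i = 1, 1000
    a(i) = 2
  enddo
end program ma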

This is what I get with a V100 and driver version 440.64.00:

$ cat b.f90
program ma
implicit none
integer, device :: a(1000)
integer :: i
a=1
!$cuf kernel do
do i=1,1000
a(i)=2
enddo
end

$ pgf90  -Mcuda=cc70,cuda11.0 b.f90
$ ./a.out
cudaGetDevice returned status 35: CUDA driver version is insufficient for CUDA runtime version
$ pgf90  -Mcuda=cc70,cuda10.2 b.f90
$ ./a.out
$
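
The two runs above fit the idea that each CUDA toolkit release needs at least a certain driver version. As a rough sketch, the check can be scripted. The version table below is an assumption based on the minimum Linux driver versions listed in NVIDIA's CUDA release notes (10.1: 418.39, 10.2: 440.33, 11.0: 450.36.06, 11.1: 455.23), so please verify it for your exact toolkit release. The function names are made up for illustration, and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch: is a given NVIDIA driver new enough for a given CUDA toolkit?
# ASSUMPTION: minimum Linux driver versions as listed in NVIDIA's CUDA
# release notes; double-check them for your exact toolkit release.
min_driver_for_cuda() {
    case "$1" in
        10.1) echo "418.39" ;;
        10.2) echo "440.33" ;;
        11.0) echo "450.36.06" ;;
        11.1) echo "455.23" ;;
        *)    return 1 ;;          # version not in this small table
    esac
}

# version_ge A B: succeeds when dotted version A >= B (needs GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# driver_ok DRIVER CUDA: report whether DRIVER can run that CUDA runtime.
driver_ok() {
    required=$(min_driver_for_cuda "$2") || {
        echo "CUDA $2: not in table"; return 2; }
    if version_ge "$1" "$required"; then
        echo "driver $1 is new enough for CUDA $2 (needs >= $required)"
    else
        echo "driver $1 is too old for CUDA $2 (needs >= $required)"
        return 1
    fi
}
```

With this table, `driver_ok 440.64.00 11.0` reports the driver as too old while `driver_ok 440.64.00 10.2` succeeds, matching the two a.out runs above. By the same table, driver 418.67 would pass for CUDA 10.1 but not for 10.2. The installed driver version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.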


Best regards,
Pietro



On 12/31/20 2:08 PM, Rahul Verma wrote:
> Hi Pietro,
> 
> Thank you, but according to this article
> 
> https://docs.nvidia.com/deploy/cuda-compatibility/index.html
> 
> NVIDIA driver version 418 is compatible with CUDA versions 10.1 and 10.2.
> 
> The following is the output of the command "nvidia-smi":
> +-----------------------------------------------------------------------------+
> | NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
> |-------------------------------+----------------------+----------------------+
> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
> |===============================+======================+======================|
> |   0  Tesla V100-PCIE...  Off  | 00000000:5E:00.0 Off |                  Off |
> | N/A   40C    P0    36W / 250W |      0MiB / 16130MiB |      0%      Default |
> +-------------------------------+----------------------+----------------------+
> 
> +-----------------------------------------------------------------------------+
> | Processes:                                                       GPU Memory |
> |  GPU       PID   Type   Process name                             Usage      |
> |=============================================================================|
> |  No running processes found                                                 |
> +-----------------------------------------------------------------------------+
> 
> 
> 
>> Dear Rahul,
>>
>> that is useful, you can look up error codes of type 9xxx here:
>>
>> https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038
>>
>> and you'll find
>>
>> cudaErrorInsufficientDriver = 35
>>       This indicates that the installed NVIDIA CUDA driver is older than
>> the CUDA runtime library. This is not a supported configuration. Users
>> should install an updated NVIDIA display driver to allow the application
>> to run.
>>
>> The driver installed on the machine is apparently also too old for CUDA
>> 10.2. Can you switch to an older version of the CUDA runtime? Otherwise
>> you'll need to contact the system administrator.
>>
>> Best regards,
>> Pietro
>>
>>
>> On 12/31/20 1:26 PM, Rahul Verma wrote:
>>> Hi Pietro,
>>>     I have tried the configure option --disable-parallel, but now I am
>>> getting a message passing error:
>>>
>>>
>>> *** error in Message Passing (mp) module ***
>>> *** error code:  9035
>>>
>>> I have also changed the CUDA version from 11.1 to 10.2, but the error is
>>> the same.
>>>
>>>
>>>> Dear Rahul Verma,
>>>>
>>>> I would try the serial version first. Can you reconfigure and compile
>>>> with the additional option
>>>>
>>>> --disable-parallel
>>>>
>>>> and check if the misbehavior persists?
>>>>
>>>> Best regards,
>>>> Pietro
>>>>
>>>>
>>>> On 12/31/20 7:50 AM, Rahul Verma wrote:
>>>>> I have installed the GPU-enabled QE package on a Volta V100 machine.
>>>>> Unfortunately, when I run pw.x it does not print any output.
>>>>>
>>>>> For reference, I am attaching config.log and the library dependency
>>>>> output to this email.
>>>>>
>>>>> Thank you
>>>>>
>>>>> Regards
>>>>> Rahul Verma
>>>>> Department of Chemistry
>>>>> IIT Kanpur
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Quantum ESPRESSO is supported by MaX (http://www.max-centre.eu/)
>>>>> users mailing list users at lists.quantum-espresso.org
>>>>> https://lists.quantum-espresso.org/mailman/listinfo/users
>>>>>

