[QE-users] Running efficiently on multiple nodes
Baer, Bradly
bradly.b.baer at Vanderbilt.Edu
Fri Nov 6 00:39:48 CET 2020
Paolo,
I believe the nodes I am using have gigabit connections. There are additional nodes that have 10 or 25 gigabit connections but I don't think I would land on one of them without specifically requesting them. What communication speed would be appropriate for QE's needs?
I also did consider trying to manually set the parallelization but I don't currently know enough about SLURM to identify each node and ensure that all 16 cores assigned from a pool are on the same node. I will keep it in mind though as a possible future solution.
Thanks,
Brad
--------------------------------------------------------
Bradly Baer
Graduate Research Assistant, Walker Lab
Interdisciplinary Materials Science
Vanderbilt University
________________________________
From: Paolo Giannozzi <p.giannozzi at gmail.com>
Sent: Thursday, November 5, 2020 3:54 PM
To: Baer, Bradly <bradly.b.baer at Vanderbilt.Edu>; Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org>
Subject: Re: [QE-users] Running efficiently on multiple nodes
Are there fast communications between the two nodes? If not, the parallel distributed 3D FFT will be very slow (note the time taken by fft_scatt_yz). You might find it convenient to exploit k-point parallelization, which requires much less communication: for instance, "mpirun -n 32 pw.x -nk 2" (2 pools of 16 processors, each pool performing a parallel FFT). You will, however, have to figure out a way to place the first pool of 16 processors on node 1 and the second on node 2 (or vice versa, as long as FFT parallelization happens inside a node and k-point parallelization across nodes).
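To make the placement requirement concrete, here is a minimal sketch that checks whether any pool would straddle two nodes. It rests on two assumptions that are not stated in this thread and should be verified for your installation: that pools are formed from contiguous blocks of MPI ranks, and that the launcher fills nodes one after another (block placement). The function names are hypothetical helpers, not part of QE or SLURM.

```python
def pool_of_rank(rank, nproc, npool):
    """Pool index of an MPI rank, ASSUMING pools are contiguous blocks of ranks."""
    if nproc % npool != 0:
        raise ValueError("nproc must be divisible by npool")
    return rank // (nproc // npool)


def node_of_rank(rank, tasks_per_node):
    """Node index of an MPI rank, ASSUMING block placement (node 0 is filled first)."""
    return rank // tasks_per_node


def pools_spanning_nodes(nproc, npool, tasks_per_node):
    """Return the sorted list of pools whose ranks land on more than one node."""
    nodes_by_pool = {}
    for rank in range(nproc):
        pool = pool_of_rank(rank, nproc, npool)
        nodes_by_pool.setdefault(pool, set()).add(node_of_rank(rank, tasks_per_node))
    return sorted(pool for pool, nodes in nodes_by_pool.items() if len(nodes) > 1)


if __name__ == "__main__":
    # 32 ranks, 2 pools, 16 ranks per node: an empty list means no pool crosses a node.
    print(pools_spanning_nodes(32, 2, 16))
```

Under these assumptions, 32 ranks on two nodes with 16 ranks each and "-nk 2" keep every pool inside one node, while the same job without "-nk" (a single pool) necessarily spreads its FFT over both nodes.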
Paolo
On Thu, Nov 5, 2020 at 7:29 PM Baer, Bradly via users <users at lists.quantum-espresso.org> wrote:
Paolo,
Thank you for your suggestion. I will add recompiling with version 6.6 to my to-do list. For now, I corrected the pseudopotential files as you indicated and the calculation ran successfully. It has become slightly faster, but it is still much slower than running on a single node (3:30 vs 0:30). Is there more that I should be doing to improve performance, or is my test problem too small to see the benefits of parallelization?
Thanks,
Brad
________________________________
From: users <users-bounces at lists.quantum-espresso.org> on behalf of Paolo Giannozzi <p.giannozzi at gmail.com>
Sent: Thursday, November 5, 2020 10:01 AM
To: Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org>
Subject: Re: [QE-users] Running efficiently on multiple nodes
On Thu, Nov 5, 2020 at 3:05 PM Baer, Bradly <bradly.b.baer at vanderbilt.edu> wrote:
Pseudo file Ga.pbe-dn-kjpaw_psl.1.0.0.UPF has been fixed on the fly.
To avoid this message in the future, permanently fix
your pseudo files following these instructions:
https://gitlab.com/QEF/q-e/blob/master/upftools/how_to_fix_upf.md
This is a possible source of trouble if the output directory is not visible to all processors. Please try one of the following:
- do what is suggested (or simply: edit Ga.pbe-dn-kjpaw_psl.1.0.0.UPF and replace all occurrences of "&" with "&amp;")
- get version 6.6, which reads the pseudopotential file on one processor and broadcasts its contents to all other processes
- get the development version, which in addition is not sensitive to the presence of nonstandard "&" in the files.
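The manual fix in the first option (escaping bare ampersands as the XML entity "&amp;") could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, the regular expression is my own choice rather than anything from the QE instructions, and it deliberately skips ampersands that already begin an XML entity so the script can be run twice without double-escaping. Keep a backup of the original file.

```python
import re
from pathlib import Path

# Matches "&" unless it already starts a predefined XML entity
# (&amp; &lt; &gt; &quot; &apos;) or a numeric character reference.
_BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos);|#\d+;|#x[0-9A-Fa-f]+;)")


def escape_bare_ampersands(text):
    """Return text with every bare '&' replaced by '&amp;'."""
    return _BARE_AMP.sub("&amp;", text)


def fix_upf_file(path):
    """Rewrite a pseudopotential file in place, saving the original as <name>.bak."""
    path = Path(path)
    original = path.read_text()
    fixed = escape_bare_ampersands(original)
    if fixed != original:
        path.with_name(path.name + ".bak").write_text(original)
        path.write_text(fixed)
    return fixed != original  # True if the file was changed
```

For the file named in this thread, the call would be fix_upf_file("Ga.pbe-dn-kjpaw_psl.1.0.0.UPF"), run in the directory that holds the pseudopotentials.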
Paolo
-Brad
________________________________
From: users <users-bounces at lists.quantum-espresso.org> on behalf of Paolo Giannozzi <p.giannozzi at gmail.com>
Sent: Thursday, November 5, 2020 2:33 AM
To: Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org>
Subject: Re: [QE-users] Running efficiently on multiple nodes
On Wed, Nov 4, 2020 at 11:28 PM Baer, Bradly <bradly.b.baer at vanderbilt.edu> wrote:
Now that I have two nodes, the script for a single node results in a crash shortly after reading in the pseudopotentials.
Which version of QE are you using, which crash do you obtain, and with which executable?
Paolo
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
_________________
Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
users mailing list users at lists.quantum-espresso.org
https://lists.quantum-espresso.org/mailman/listinfo/users