[QE-users] Running efficiently on multiple nodes
Michal Krompiec
michal.krompiec at gmail.com
Fri Nov 6 01:04:44 CET 2020
Dear Brad,
Fast communications here means InfiniBand or another RDMA-capable
interconnect. Make sure your MPI actually uses RDMA; I’ve seen systems where
it isn’t enabled by default. That said, if you use k-point parallelization
you can get away with gigabit Ethernet, as Paolo mentioned.
Best wishes,
Michal Krompiec
Merck KGaA
On Thu, Nov 5, 2020 at 11:40 PM Baer, Bradly via users <
users at lists.quantum-espresso.org> wrote:
> Paolo,
>
> I believe the nodes I am using have gigabit connections. There are
> additional nodes that have 10 or 25 gigabit connections but I don't think I
> would land on one of them without specifically requesting them. What
> communication speed would be appropriate for QE's needs?
>
> I also did consider trying to manually set the parallelization but I don't
> currently know enough about SLURM to identify each node and ensure that all
> 16 cores assigned from a pool are on the same node. I will keep it in mind
> though as a possible future solution.
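
One way to see where SLURM placed the tasks is to print one hostname per task and count the tasks per node. The sketch below assumes a POSIX shell; the helper name tasks_per_host is made up for illustration, while srun, scontrol show hostnames and SLURM_JOB_NODELIST are standard SLURM. The SLURM commands are left as comments because they only work inside an allocation.

```shell
#!/bin/sh
# Turn "one hostname per line" (one line per MPI task) into "host count" lines.
tasks_per_host() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Inside a SLURM job (not executed here):
#   srun hostname | tasks_per_host                  # tasks started on each node
#   scontrol show hostnames "$SLURM_JOB_NODELIST"   # nodes in the allocation
```

If every node reports 16 tasks, a request such as --nodes=2 --ntasks-per-node=16 has been honoured and each group of 16 tasks sits on a single node.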
>
> Thanks,
> Brad
>
> --------------------------------------------------------
> Bradly Baer
> Graduate Research Assistant, Walker Lab
> Interdisciplinary Materials Science
> Vanderbilt University
>
>
> ------------------------------
> *From:* Paolo Giannozzi <p.giannozzi at gmail.com>
> *Sent:* Thursday, November 5, 2020 3:54 PM
> *To:* Baer, Bradly <bradly.b.baer at Vanderbilt.Edu>; Quantum ESPRESSO users
> Forum <users at lists.quantum-espresso.org>
>
> *Subject:* Re: [QE-users] Running efficiently on multiple nodes
>
> Are there fast communications between the two nodes? If not, the parallel
> distributed 3D FFT will be very slow (note the time taken by fft_scatt_yz).
> You might find it convenient to exploit k-point parallelization, which
> requires much less communication: for instance, "mpirun -n 32 pw.x -nk 2"
> (2 pools of 16 processors, each pool performing parallel FFT). You will
> have to figure out a way to place the first pool of 16 processors on node 1
> and the second on node 2 (or vice versa, as long as FFT parallelization
> happens inside a node and k-point parallelization across nodes).
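
A rough sketch of the placement check, under two assumptions that should be verified on the actual cluster: that pw.x builds its pools from consecutive MPI ranks (so with -nk 2 and 32 processes, ranks 0-15 form the first pool), and that the launcher fills nodes in blocks (ranks 0-15 on the first node). The function names are hypothetical, as is the input file name pw.in in the commented launch line.

```shell
#!/bin/sh
# Pool index of an MPI rank, ASSUMING pools are made of consecutive ranks
# and that the number of processes is divisible by the number of pools.
pool_of_rank() {
    echo $(( $1 / ($2 / $3) ))        # args: rank, nprocs, npools
}

# Node index of an MPI rank, ASSUMING block placement (nodes filled in order).
node_of_rank() {
    echo $(( $1 / $2 ))               # args: rank, tasks per node
}

# Print "ok" if every pool lies on a single node, "split" otherwise.
check_layout() {
    cl_nprocs=$1; cl_npools=$2; cl_tpn=$3
    cl_per_pool=$(( cl_nprocs / cl_npools ))
    cl_rank=0
    while [ "$cl_rank" -lt "$cl_nprocs" ]; do
        cl_pool=$(pool_of_rank "$cl_rank" "$cl_nprocs" "$cl_npools")
        cl_first=$(( cl_pool * cl_per_pool ))   # first rank of this pool
        if [ "$(node_of_rank "$cl_rank" "$cl_tpn")" -ne \
             "$(node_of_rank "$cl_first" "$cl_tpn")" ]; then
            echo "split"
            return 1
        fi
        cl_rank=$(( cl_rank + 1 ))
    done
    echo "ok"
}

# Example batch header and launch line (commented out so the sketch runs anywhere):
#   #SBATCH --nodes=2
#   #SBATCH --ntasks-per-node=16
#   mpirun -n 32 pw.x -nk 2 -i pw.in > pw.out
```

With 32 processes, 2 pools and 16 tasks per node, check_layout 32 2 16 reports that no pool straddles a node; with 12 tasks per node it reports a split. Whether mpirun really uses block placement depends on the MPI library and the scheduler settings, so it is worth confirming with the actual hostnames.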
>
> Paolo
>
> On Thu, Nov 5, 2020 at 7:29 PM Baer, Bradly via users <
> users at lists.quantum-espresso.org> wrote:
>
> Paolo,
>
> Thank you for your suggestion. I will add recompiling to move to 6.6 to
> my to-do list. For now, I corrected the pseudopotential files as you
> indicated and the calculation ran successfully. It has become slightly
> faster, but still much slower than running on a single node (3:30s vs
> 0:30s). Is there more that I should be doing to improve performance or is
> my test problem too small to see the benefits of parallelization?
>
> Thanks,
> Brad
>
>
> ------------------------------
> *From:* users <users-bounces at lists.quantum-espresso.org> on behalf of
> Paolo Giannozzi <p.giannozzi at gmail.com>
> *Sent:* Thursday, November 5, 2020 10:01 AM
> *To:* Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org>
> *Subject:* Re: [QE-users] Running efficiently on multiple nodes
>
> On Thu, Nov 5, 2020 at 3:05 PM Baer, Bradly <bradly.b.baer at vanderbilt.edu>
> wrote:
>
>
> *Pseudo file Ga.pbe-dn-kjpaw_psl.1.0.0.UPF has been fixed on the fly.*
> *To avoid this message in the future, permanently fix *
> * your pseudo files following these instructions: *
> *https://gitlab.com/QEF/q-e/blob/master/upftools/how_to_fix_upf.md*
>
>
> This is a possible source of trouble if the output directory is not
> visible to all processors. Please try one of the following:
> - do what is suggested (or simply: edit Ga.pbe-dn-kjpaw_psl.1.0.0.UPF and
> replace all occurrences of "&" with "&amp;")
> - get version 6.6, which reads the pseudopotential file on one processor
> and broadcasts its contents to all other processes
> - get the development version, which in addition is not sensitive to the
> presence of nonstandard "&" in the files.
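
The manual edit in the first option can be scripted. A minimal sketch, assuming a POSIX sed and assuming that the only entity possibly already present in the file is "&amp;" (any other entity, such as "&lt;", would be escaped a second time). The function name escape_amp and the output file name in the usage comment are illustrative only.

```shell
#!/bin/sh
# Escape bare "&" as "&amp;" on stdin, writing the result to stdout.
# Existing "&amp;" is first turned back into "&", so running the filter
# twice gives the same result as running it once.
escape_amp() {
    sed -e 's/&amp;/\&/g' -e 's/&/\&amp;/g'
}

# Usage (writes a fixed copy, leaving the original file untouched):
#   escape_amp < Ga.pbe-dn-kjpaw_psl.1.0.0.UPF > Ga.fixed.UPF
```

Writing to a new file rather than editing in place keeps the original pseudopotential available for comparison.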
>
> Paolo
>
>
>
> -Brad
>
>
> ------------------------------
> *From:* users <users-bounces at lists.quantum-espresso.org> on behalf of
> Paolo Giannozzi <p.giannozzi at gmail.com>
> *Sent:* Thursday, November 5, 2020 2:33 AM
> *To:* Quantum ESPRESSO users Forum <users at lists.quantum-espresso.org>
> *Subject:* Re: [QE-users] Running efficiently on multiple nodes
>
> On Wed, Nov 4, 2020 at 11:28 PM Baer, Bradly <bradly.b.baer at vanderbilt.edu>
> wrote:
>
>
> Now that I have two nodes, the script for a single node results in a crash
> shortly after reading in the pseudopotentials.
>
>
> which version of QE are you using, and which crash do you obtain, with
> which executable?
>
> Paolo
> --
> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222
>
> _________________
> Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
> users mailing list users at lists.quantum-espresso.org
> https://lists.quantum-espresso.org/mailman/listinfo/users