[QE-users] Ph.x crashing on multiple nodes

Baer, Bradly bradly.b.baer at Vanderbilt.Edu
Mon Dec 28 19:47:01 CET 2020


Hello all,

I am experiencing a crash when running ph.x across multiple nodes.  Input and output files are attached.  The first q-point appears to be calculated correctly, but the code crashes when it starts calculating the second q-point.  The error reports that a file "charge-density" is missing, yet "charge-density.dat" exists when I inspect the files manually.  Since the missing-file error is reported 16 times and each node has 16 cores, I assume the problem is related to my use of multiple nodes.  A general description of my computing environment and workflow follows:

I am using SLURM on a cluster.  My job is assigned two nodes, each with a local scratch drive that is not visible to the other node.  Both nodes can also access a GPFS networked drive.  To improve performance, I am attempting to perform all calculations on the local scratch drives.  All input files are copied from the GPFS drive to the local drive on each node before the initial pw.x calculation.  After the pw.x calculation, a small script copies the output files (the pwscf.save folder and pwscf.xml) from the first node to the networked drive, and a second script then copies them from the networked drive to the second node before the phonon code starts.
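
To make the staging step easier to check, here is a minimal sketch of a per-node sanity check that could be run on each node just before ph.x starts.  It only tests whether the files named above (pwscf.xml and pwscf.save/charge-density.dat) exist under a given output directory.  The function names, the EXPECTED list, and the idea that these two files are sufficient are assumptions for illustration, not something taken from the attached scripts or from the ph.x documentation.

```python
import os
import socket

# Files named in this message, relative to the ph.x output directory.
# Assumption: the full set ph.x needs may be larger; extend as required.
EXPECTED = [
    "pwscf.xml",
    os.path.join("pwscf.save", "charge-density.dat"),
]


def missing_files(outdir, expected=EXPECTED):
    """Return the entries of `expected` that are not regular files under `outdir`."""
    return [rel for rel in expected
            if not os.path.isfile(os.path.join(outdir, rel))]


def report(outdir):
    """Print a one-line summary for this host; return True if nothing is missing."""
    missing = missing_files(outdir)
    host = socket.gethostname()
    if missing:
        print(f"{host}: missing under {outdir}: {', '.join(missing)}")
    else:
        print(f"{host}: all expected files present under {outdir}")
    return not missing


def main(argv):
    """Hypothetical entry point: argv[1] is the node-local scratch outdir."""
    outdir = argv[1] if len(argv) > 1 else "."
    return 0 if report(outdir) else 1
```

If this were saved as a script (the name check_stage.py is hypothetical) that ends with sys.exit(main(sys.argv)), it could be launched once per node, for example with srun --ntasks-per-node=1, so that each hostname reports what its own local scratch drive actually contains.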

I am open to any suggestions, as this solution was hacked together after performance on the GPFS networked drive proved very poor.

Thanks,
Brad


--------------------------------------------------------
Bradly Baer
Graduate Research Assistant, Walker Lab
Interdisciplinary Materials Science
Vanderbilt University


-------------- next part --------------
A non-text attachment was scrubbed...
Name: Phonon.out
Type: application/octet-stream
Size: 35229 bytes
Desc: Phonon.out
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20201228/f0591008/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: QESlurm.slurm
Type: application/octet-stream
Size: 1026 bytes
Desc: QESlurm.slurm
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20201228/f0591008/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Phonon.in
Type: application/octet-stream
Size: 175 bytes
Desc: Phonon.in
URL: <http://lists.quantum-espresso.org/pipermail/users/attachments/20201228/f0591008/attachment-0002.obj>

