[Q-e-developers] hybrid functionals (CVS version of Q-E)

josepht at chips.ncsu.edu josepht at chips.ncsu.edu
Mon Apr 26 22:37:33 CEST 2010


Dear developers,

After running the examples, I am further testing the EXX code in Q-E. Runs were
performed on an IBM AIX machine using 4-16 nodes with 4-8 cores each.  PWSCF was
compiled following the instructions ("DEXX" flag).  Pool parallelization was NOT
attempted, not even for nqx=gamma.  K-grids were generated automatically (both
shifted and unshifted).  Input and output files are in ssave.tar, attached.

The following cases were tested (cases 3 (nimages test), 6 and 10 failed):

case 1: =======================
2 Si cell, HSE, k=8x8x8, nqx=8x8x8
16x4 CPU, 1 image
STATUS: success
CPU TIME:  2h19m
Input:  same as case 2, but with 8x8x8 k-points
Output:  Save.scf_888000.out

case 2: =======================
2 Si cell, HSE, k=12x12x12, nqx=8x8x8
16x4 CPU, 1 image
STATUS:  success
CPU TIME:  12h17m
Input:  Save.scf_121212000.in
Output:  Save.scf_121212000.out

case 3: =======================
2 Si cell, HSE, k=4x4x4, nqx=2x2x2
16x4 CPU, 2 images (-nimage 2 at runtime)
STATUS:  "success", but no direct output! (failed)
CPU TIME:  23.76s
Input:  Save.scf444111_image.in
Output:  Save.scf444111_image.out

case 4: =======================
2 Si cell, HSE, k=4x4x4, nqx=2x2x2
16x4 CPU, 1 image
STATUS:  success
CPU TIME:   55.37s
Input:  Save.scf444111.in
Output:  Save.scf444111.out

case 5: =======================
8 Si cell, PBE, k=4x4x4, nqx=N/A
8x4 CPU, 1 image
STATUS:  success
CPU TIME:  14.11s
Input:  see case 6 input, minus hybrid parameters
Output:  nothing unusual wrt Q-E standard release

case 6: =======================
8 Si cell, HSE, k=4x4x4, nqx=1x1x1
4x8 CPU, 1 image
STATUS:  failed
CPU TIME:  N/A
Input:  Save.HSE8scf444111.in
Output:  Save.HSE8scf444111.out

case 7: =======================
8 Si cell, HSE, k=2x2x2, nqx=1x1x1
4x8 CPU, 1 image
STATUS:  success
CPU TIME:  11.58s
Input:  Save.HSE8scf222111a.in
Output:  Save.HSE8scf222111a.out

case 8: =======================
8 Si cell, HSE, k=2x2x2, nqx=2x2x2
4x8 CPU, 1 image
STATUS:   success
CPU TIME:  1m 6.83s
Input:  Save.HSE8scf222111b.in
Output:  Save.HSE8scf222111b.out

case 9: =======================
64 Si cell, HSE, k=2x2x2, nqx=2x2x2
8x8 CPU, 1 image
STATUS:  success
CPU TIME:  5h56m
Input:  Save.64scf222111b.in
Output:  Save.64scf222111b.out

case 10: ======================
64 Si cell, HSE, k=4x4x4, nqx=4x4x4
8x8 CPU, 1 image
STATUS:  failed
CPU TIME:  N/A
Input:  Save.64scf444111.in
Output:  Save.64scf444111.out


############ COMMENTS ##############

HSE "images" parallelization does not quite work: the run finishes, but no useful
output is written.  If this is only a "simple" i/o matter and the calculation itself
is being performed properly aside from writeout, the speedup is quite dramatic
(case 3 vs. case 4: 23.76s vs. 55.37s)! :-)
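
To put a number on that speedup, here is a minimal Python sketch that converts the
CPU TIME strings from the case list into seconds and compares case 4 (1 image) with
case 3 (2 images).  The function name cpu_time_seconds is hypothetical, and the
ratio is only meaningful if the 2-image run really performed the full calculation,
which has not been verified.

```python
import re

def cpu_time_seconds(text):
    """Convert strings such as '2h19m', '1m 6.83s' or '55.37s' to seconds."""
    units = {"h": 3600.0, "m": 60.0, "s": 1.0}
    # Each part is a number followed (optionally after spaces) by h, m or s.
    parts = re.findall(r"(\d+(?:\.\d+)?)\s*([hms])", text)
    if not parts:
        raise ValueError(f"unrecognized CPU time: {text!r}")
    return sum(float(value) * units[unit] for value, unit in parts)

case3 = cpu_time_seconds("23.76s")  # k=4x4x4, nqx=2x2x2, 2 images
case4 = cpu_time_seconds("55.37s")  # same system, 1 image
ratio = case4 / case3
print(f"case 4 / case 3 time ratio: {ratio:.2f}")
```

The same helper can be applied to the other entries (e.g. "12h17m") to compare how
the cost scales with the k and q grids.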

In cases 6 and 10, the run terminates with "problems computing cholesky
decomposition" piped to 'CRASH', but I think that is secondary to the real problem:

===========
     highest occupied, lowest unoccupied level (ev):     5.9470    7.1541
 1.03727125536023235 1.15624457705622152
EXX divergence (   1)=    -255.1219      0.2381
  ! EXXALFA SET TO  0.250000000000000000
     EXX: now go back to refine exchange calculation
 -0.253504457376716615E+264

     Self-consistent Calculation

     iteration #  1     ecut=    42.00 Ry     beta=0.50
     Davidson diagonalization with overlap
============

The E+264 value obviously stands out (a quick trial differing only in the use of
8x8 processors shows a NaN at the same place).  I am therefore fairly confident that
the real issue is not the diagonalization in the first step of the 2nd scf loop but
the step that produces this value; I note the crash remark only for completeness.
I am interested in further testing on this point, and am hoping for some input from
the EXX project's developers.
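
For anyone repeating these tests, a small Python sketch like the following could
flag such values automatically in an output file.  It is only an illustration: the
names NUMBER and suspicious_values and the 1e50 threshold are my own assumptions,
not part of Q-E.

```python
import math
import re

# Fortran-style reals (e.g. -0.2535E+264, 1.5D-03) plus NaN/Infinity tokens.
NUMBER = re.compile(
    r"[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eEdD][-+]?\d+)?|\bnan\b|\binf(?:inity)?\b",
    re.IGNORECASE,
)

def suspicious_values(lines, limit=1e50):
    """Yield (line number, token) for non-finite or absurdly large numbers."""
    for lineno, line in enumerate(lines, start=1):
        for token in NUMBER.findall(line):
            # Python's float() does not accept the Fortran 'D' exponent marker.
            value = float(token.replace("D", "E").replace("d", "e"))
            if not math.isfinite(value) or abs(value) > limit:
                yield lineno, token

excerpt = [
    "EXX divergence (   1)=    -255.1219      0.2381",
    "     EXX: now go back to refine exchange calculation",
    " -0.253504457376716615E+264",
]
for lineno, token in suspicious_values(excerpt):
    print(f"line {lineno}: {token}")
```

To scan a real run, pass an open file instead of the list, e.g.
suspicious_values(open("Save.HSE8scf444111.out")).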

In case 9, the calculation hangs for a long time at the first scf step of each loop,
but completes normally otherwise.  I did not test the performance under different
diag schemes (cg, ndiag flag, etc).

Case 10 fails just like case 6, which is not shocking.  It appears that one can do
many k and q points for the 2 atom cell, but not for larger cells.  I tested several
vanilla K-S scf calculations (and posted one) with the same pw code, and nothing
troubling popped up for large numbers of k-points.  The same problem is also
produced for PBE0 (tested for 8 atom Si, output overwritten), and is independent of
x_gamma_extrapolation treatment (within the two options easily available).

No systems other than Si have been tested yet.

In the spirit of the other EXX examples, I would like to test some simple C-based
isolated molecules and polymers (12-20 atoms), perhaps graphene, etc, but I would
first like to confirm which pseudopotentials from the repository should be used
(i.e. one needs at minimum a norm-conserving PBE psp, and to me the choice from
those available is not obvious - I have always used ultrasoft).

In the EXX examples, I have no idea where the undocumented psps come from (comparing
to the input scripts in atomic, they appear to have been generated via the atomic
code).  Should I perform batteries of EXX tests using this psp, or can you suggest a
different one?  Similar question for sulfur (for one of my planned polymer test
cases).  Also, I notice that a cutoff of 80 has been applied in the examples.  I
will probably test with lower cutoffs (as I did with Si), assuming that no new
problems will appear for higher cutoffs.  Please let me know if there is sufficient
diagnostic benefit in using more reasonable (i.e. higher) cutoffs; I will gladly
do so.

Joseph Turnbull
North Carolina State University




> On Apr 16, 2010, at 22:05 , josepht at chips.ncsu.edu wrote:
>
>> I am presently testing the code
>
> there are two known problems, with restart and with spin-polarized calculations.
> The latter was fixed two days ago:
>    http://qe-forge.org/cgi-bin/cvstrac/q-e/chngview?cn=7557
> Restarting also sort of works, but it is still not perfect.
>
>> If any of you (or anyone else within the development team) are interested in
>> feedback right now, or if development/diagnostics could be assisted by looking at
>> particular test cases, just let me know.
>
> I am very interested. Something of general interest is to know whether the
> "image" parallelization over k-points still works. In any event, please send your
> findings to q-e-developers at qe-forge.org. Thank you
>
> Paolo
> ---
> Paolo Giannozzi, Dept of Physics, University of Udine
> via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ssave.tar
Type: application/x-tar
Size: 327680 bytes
Desc: not available
URL: <http://lists.quantum-espresso.org/pipermail/developers/attachments/20100426/a53846b9/attachment.tar>

