[Pw_forum] newbie update

Carlo Nervi carlo.nervi at unito.it
Thu Mar 5 13:03:23 CET 2009


> but this does NOT work:
> OMP_NUM_THREAD=1
> ./myprogram
>
> mysteries of bash shell...
>
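I think this is because bash sets a variable temporarily,
for a single command, only when the assignment is written
on the same line as that command. On a line of its own it
stays a plain shell variable, which the program does not
see unless it is exported...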
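
The difference can be reproduced with a small sketch. It is only an illustration: `sh -c` stands in for ./myprogram (an assumption made so the example is self-contained), and the variable is spelled OMP_NUM_THREADS, as later in this message.

```shell
# Stand-in for ./myprogram: a child process that reports what it sees.
show() { sh -c 'echo "${OMP_NUM_THREADS:-unset}"'; }

unset OMP_NUM_THREADS

# 1) Assignment on its own line: a shell variable only, not exported,
#    so the child process does not see it.
OMP_NUM_THREADS=1
plain=$(show)

# 2) Assignment on the same line as the command: placed in the
#    environment of that one command only.
unset OMP_NUM_THREADS
inline=$(OMP_NUM_THREADS=1 sh -c 'echo "${OMP_NUM_THREADS:-unset}"')

# 3) export: visible to every command started afterwards.
export OMP_NUM_THREADS=1
exported=$(show)

echo "plain=$plain inline=$inline exported=$exported"
# prints: plain=unset inline=1 exported=1
```

In a job script the exported form is usually the safest choice, because it does not depend on how the command line is laid out.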

Anyway, I ran a few tests, as suggested by Axel, on 32 water
molecules (example 21).
I started to play with version 4.1 and forgot that it is
still under testing.
The following results refer to version 4.1CVS...
sorry...
I built two executables: v0 is compiled without -ipo, v1 is
compiled with -ipo.
I still have to try the sequential setting, but my mkl lib
directory does not contain any lib-sequential,
so I am going to try some compiler switches.
Here are the results (read OMP as OMP_NUM_THREADS and MPI
as the number of CPUs used with mpiexec):

The test using OMP=8 and no MPI:
v0:    CP           :  4m 1.98s CPU time,     3m 0.83s wall time
v1:    CP           :  3m49.07s CPU time,     2m57.33s wall time

Using OMP=1 and MPI=8
v0:    CP           :  0m36.43s CPU time,     1m31.59s wall time
v1:    CP           :  0m35.39s CPU time,     1m30.67s wall time

There is a clear advantage in using MPI, even though the
%CPU of each process (with MPI) is only between 40 and 50%.
In fact the CPU time is only about 40% of the wall time
(0m36.43s against 1m31.59s for v0). I suspect that Axel is
right in pointing to disk I/O. I'm afraid that to improve
the performance I will have to set up RAID 10...
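
The CPU/wall ratio can be checked directly from the timing lines. The following is a minimal sketch; the helper names `to_seconds` and `cpu_share` are made up for this illustration and are not part of any package. What the ratio means depends on how the code accumulates CPU time across threads and processes, which is not stated here.

```shell
# Convert a time such as "1m31.59s" (or "4m 1.98s") to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Print CPU time as a percentage of wall time, rounded to an integer.
cpu_share() {
    awk -v cpu="$(to_seconds "$1")" -v wall="$(to_seconds "$2")" \
        'BEGIN { printf "%.0f\n", 100 * cpu / wall }'
}

cpu_share 0m36.43s 1m31.59s    # v0, OMP=1, MPI=8   -> 40
cpu_share "4m 1.98s" "3m 0.83s"  # v0, OMP=8, no MPI -> 134
```

A value above 100 in the OMP=8 run suggests that the reported CPU time there is summed over threads, but that is an inference from the numbers, not something the output states.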

OMP=2, MPI=8
v0:      CP           :  0m46.98s CPU time,     1m36.08s wall time
Slightly worse than OMP=1, MPI=8.

OMP=4, MPI=4
v0:     CP           :  1m25.82s CPU time,     2m 4.96s wall time

OMP=8, MPI=8
v0:     CP           :  1m17.36s CPU time,     2m11.83s wall time
Here the two parallelization schemes probably interfere
with each other.
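
One way to look at this interference is to count how many threads each combination starts in total: every MPI process starts its own OMP_NUM_THREADS threads. The sketch below is only illustrative; `total_threads` is a hypothetical helper, and since the number of cores of the test machine is not stated in this message, the core count is read from whatever machine runs the script.

```shell
# Total threads started by a hybrid run:
# (number of MPI processes) x (OpenMP threads per process).
total_threads() {
    echo $(( $1 * $2 ))
}

# Cores on the machine running this script. The getconf key
# _NPROCESSORS_ONLN is widely supported but not universal, so fall
# back to 1 if it is missing.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# The OMP/MPI combinations tried above, as "OMP MPI" pairs.
for combo in "1 8" "2 8" "4 4" "8 8"; do
    set -- $combo
    echo "OMP=$1 MPI=$2 -> $(total_threads "$1" "$2") threads, $cores cores"
done
```

When the thread count is well above the core count, the threads compete for the same cores, which would be consistent with the slower hybrid timings.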


I also did a couple of runs on 64 water molecules:

OMP=1, MPI=8
v0:     CP           :  3m41.49s CPU time,     7m14.67s wall time

OMP=8, no MPI
v0:     CP           : 15m27.18s CPU time,    12m56.67s wall time

In this case the CPU time/wall time ratio is slightly
better, and in fact the %CPU usage was between 50 and 60%.

Furthermore, I also ran tests/check-pw.x.j, on version v1
only:

./check-pw.x.j
Checking atom-lsda...passed
Checking atom-pbe...discrepancy in number of scf iterations detected
Reference: 7, You got: 5
discrepancy in pressure detected
Reference: -14.44, You got: -14.48
Checking atom-sigmapbe...discrepancy in pressure detected
Reference: -15.02, You got: -15.01
Checking atom...passed
Checking berry...passed
Checking berry, step 2 ...passed
Checking electric0...discrepancy in number of scf iterations detected
Reference: 8, You got: 9
Checking electric1...passed
Checking electric2...passed
Checking eval_infix...passed
Checking eval_infix, step 2 ...discrepancy in HOMO detected
Reference: -8.4542, You got: -8.4554
discrepancy in LUMO detected
Reference: -0.4297, You got: -0.4300
Checking lattice-ibrav0-abc...passed
Checking lattice-ibrav0-cell_parameters+a...passed
Checking lattice-ibrav0-cell_parameters+celldm...passed
Checking lattice-ibrav0-cell_parameters...passed
Checking lattice-ibrav1-kauto...passed
Checking lattice-ibrav1...passed
Checking lattice-ibrav10-kauto...passed
Checking lattice-ibrav10...passed
Checking lattice-ibrav11-kauto...passed
Checking lattice-ibrav11...passed
Checking lattice-ibrav12-kauto...passed
Checking lattice-ibrav12...passed
Checking lattice-ibrav13-kauto...passed
Checking lattice-ibrav13...passed
Checking lattice-ibrav14-kauto...passed
Checking lattice-ibrav14...passed
Checking lattice-ibrav2-kauto...passed
Checking lattice-ibrav2...passed
Checking lattice-ibrav3-kauto...passed
Checking lattice-ibrav3...passed
Checking lattice-ibrav4-kauto...passed
Checking lattice-ibrav4...passed
Checking lattice-ibrav5-kauto...passed
Checking lattice-ibrav5...passed
Checking lattice-ibrav6-kauto...passed
Checking lattice-ibrav6...passed
Checking lattice-ibrav7-kauto...passed
Checking lattice-ibrav7...passed
Checking lattice-ibrav8-kauto...passed
Checking lattice-ibrav8...passed
Checking lattice-ibrav9-kauto...passed
Checking lattice-ibrav9...passed
Checking lda+U-noU...passed
Checking lda+U-user_ns...passed
Checking lda+U...passed
Checking lsda-cg...passed
Checking lsda-mixing_TF...passed
Checking lsda-mixing_localTF...passed
Checking lsda-mixing_ndim...passed
Checking lsda-nelup+neldw...passed
Checking lsda-tot_magnetization...passed
Checking lsda...passed
Checking lsda, step 2 ...passed
Checking md-pot_extrap1...passed
Checking md-pot_extrap2...passed
Checking md-wfc_extrap1...passed
Checking md-wfc_extrap2...passed
Checking md...passed
Checking metaGGA...passed
Checking metadyn...passed
Checking metal-fermi_dirac...passed
Checking metal-gaussian...passed
Checking metal-tetrahedra...passed
Checking metal-tetrahedra, step 2 ...passed
Checking metal...passed
Checking metal, step 2 ...passed
Checking neb1-H2+H...passed
Checking neb2-H2+H-symm...passed
Checking neb3-H2+H-asym...passed
Checking noncolin-cg...passed
Checking noncolin-constrain_angle...passed
Checking noncolin-constrain_atomic...discrepancy in total energy detected
Reference:   -55.690284, You got:   -55.690283
discrepancy in number of scf iterations detected
Reference: 12, You got: 14
Checking noncolin-constrain_total...discrepancy in total energy detected
Reference:   -55.544783, You got:   -55.544784
discrepancy in number of scf iterations detected
Reference: 32, You got: 30
Checking noncolin...discrepancy in pressure detected
Reference: 193.22, You got: 193.53
Checking noncolin, step 2 ...passed
Checking paw-atom...passed
Checking paw-atom_l=2...passed
Checking paw-atom_lda...passed
Checking paw-atom_spin...discrepancy in total energy detected
Reference:   -41.264991, You got:   -41.265001
discrepancy in number of scf iterations detected
Reference: 6, You got: 5
Checking paw-atom_spin_lda...discrepancy in total energy detected
Reference:   -40.244090, You got:   -40.244091
Checking paw-bfgs...discrepancy in number of scf iterations detected
Reference: 7, You got: 8
Checking paw-vcbfgs...passed
Checking relax-damped...passed
Checking relax-el...passed
Checking relax...passed
Checking relax2-bfgs_ndim3...passed
Checking relax2...passed
Checking scf-cg...passed
Checking scf-disk_io...passed
Checking scf-gamma...passed
Checking scf-k0...passed
Checking scf-kauto...passed
Checking scf-mixing_TF...passed
Checking scf-mixing_beta...passed
Checking scf-mixing_localTF...passed
Checking scf-mixing_ndim...passed
Checking scf-ncpp...discrepancy in total energy detected
Reference:   -15.839765, You got:   -15.839767
Checking scf-wf_collect...passed
Checking scf...passed
Checking scf, step 2 ...passed
Checking spinorbit...passed
Checking spinorbit, step 2 ...passed
Checking uspp-cg...passed
Checking uspp-mixing_TF...passed
Checking uspp-mixing_localTF...passed
Checking uspp-mixing_ndim...passed
Checking uspp-singlegrid...passed
Checking uspp...passed
Checking uspp, step 2 ...passed
Checking uspp1-coulomb...passed
Checking uspp1...passed
Checking uspp2...discrepancy in pressure detected
Reference: -30.68, You got: -30.69
Checking vc-relax1...passed
Checking vc-relax2...passed
Total wall time (s) spent in this run:  5274.73
Reference                            :  720.04

There are a few discrepancies, but they seem reasonable...
or not?
The run is just very long (5274.73 s against the 720.04 s
of the reference)...
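
To judge whether the discrepancies are reasonable, it helps to look at their absolute and relative size. The sketch below reads "Reference: X, You got: Y" lines, as printed by the check script above, and reports both; `diff_report` is a hypothetical helper written for this illustration, and no acceptance threshold is implied.

```shell
# For each "Reference: X, You got: Y" line on standard input, print
# the two values with their absolute and relative difference.
diff_report() {
    awk -F'[:,]' '/^Reference:/ && /You got:/ {
        ref = $2; got = $4
        gsub(/ /, "", ref); gsub(/ /, "", got)   # strip padding spaces
        d = got - ref; if (d < 0) d = -d          # absolute difference
        a = (ref < 0) ? -ref : ref                # |reference|
        rel = (a > 0) ? 100 * d / a : 0           # relative, in percent
        printf "%s vs %s: abs=%.6f rel=%.4f%%\n", ref, got, d, rel
    }'
}

# Three of the lines reported above: pressure, total energy, pressure.
diff_report <<'EOF'
Reference: -14.44, You got: -14.48
Reference:   -55.690284, You got:   -55.690283
Reference: 193.22, You got: 193.53
EOF
```

The same function can be fed the saved output of the whole check run. Lines about the number of scf iterations are matched too, so those differences show up as integer counts.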
   Carlo
P.S.: I hope this message is not too long...

------------------------------------------------------
Carlo Nervi carlo.nervi at unito.it Tel:+39 011 6707507/8
Fax: +39 011 6707855 - Dipartimento di Chimica IFM
via P. Giuria 7, 10125 Torino, Italy
http://lem.ch.unito.it/