[Pw_forum] Parallel execution and configuration issues
Konrad Gruszka
kgruszka at wip.pcz.pl
Mon Sep 19 12:01:03 CEST 2016
Dear colleagues,
I'm writing to our community to clarify, once and for all, some issues
with parallel pwscf execution.
From reading the PWSCF manual I know that there are several levels of
parallelization, including FFT-level parallelization and more. At the
same time, I'm quite sure that most beginner QE users do not have
complicated clusters, but are rather starting with decent multicore PCs.
There are two main parallel mechanisms (known to me), namely MPI and
OpenMP. Because I'm using Ubuntu, there is the free Hydra MPI launcher
(mpich) and also free OpenMP support. From what I know (please correct
me if this is wrong), MPI software is used for parallelization within
one multicore PC, and OpenMP is used for parallelization between
several PCs.
So my reasoning is: because I have 2 multicore PCs (namely PC1: 6 cores
with 2 threads each, and PC2: 4 cores with 2 threads each, giving
overall PC1: 12 and PC2: 8 logical 'cores'), on both PC1 and PC2 I've
installed Hydra MPI (for parallelization within the multicore machines),
and on both there is OpenMP for communication between PC1 and PC2, so
that they act as a 'cluster'.
I've configured OpenMP so that both machines see one another, and when I
start some pw.x job using OpenMP on both machines, I see that the
processors are 100% busy. The problem is that in the produced output
files I can see several repeats of the pwscf output (just as if several
separate pwscf processes were writing to one file at the same time).
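One common cause of exactly this symptom (though I can't be sure it is
the cause here) is an MPI mismatch: if pw.x was linked against a
different MPI library than the one providing the mpirun that launches
it, every process believes it is rank 0 of a serial run, and each one
writes a complete copy of the output into the same file. A quick check,
assuming pw.x is a dynamically linked binary on the PATH, is to inspect
which MPI shared library it uses:

```shell
# Sketch: check which MPI library pw.x is linked against. If it does not
# match the MPI that provides mpirun, each rank runs as an independent
# serial job and duplicates the output.
PW_BIN="$(command -v pw.x || true)"   # path to pw.x, if installed
if [ -n "$PW_BIN" ]; then
  # Show the MPI shared library the binary was linked with, if any.
  ldd "$PW_BIN" | grep -i mpi || echo "no MPI library found (static build?)"
else
  NOTE="pw.x not found on PATH on this machine"
  echo "$NOTE"
fi
```

The library shown here should come from the same MPICH installation as
the launcher being used.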
QE was configured and compiled with --enable-openmp on both machines,
and when I simply run pw.x on PC1 it prints:
"Parallel version (MPI & OpenMP), running on 12 processor cores"
I'm running job with this command:
mpirun.openmp -np 20 pw.x -in job.in > job.out
and, as said before, both machines are kept busy by this job (verified
using htop).
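For comparison, a pure-MPI launch of pw.x across both machines with
MPICH/Hydra would normally pin OpenMP to one thread per rank and pass
the machine file explicitly. A sketch under stated assumptions (the
machine file name `hosts.txt` and the total of 20 ranks are my
assumptions about this setup, not something from the message above):

```shell
# Sketch: a pure-MPI launch of pw.x across two machines with MPICH/Hydra.
# Assumes hosts.txt lists both hosts with their slot counts and that the
# 20 ranks should be spread over the 12 + 8 logical cores.
export OMP_NUM_THREADS=1     # one OpenMP thread per MPI rank (pure MPI)
LAUNCH="mpiexec -f hosts.txt -n 20 pw.x -in job.in"
echo "$LAUNCH"               # the actual run would redirect: $LAUNCH > job.out
```

As far as I know, launching with `-np 20` but no machine file places all
20 ranks on the local host, oversubscribing PC1 and never reaching PC2.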
When looking for help on this topic I often arrive at the advice 'please
contact your system administrator', but sadly I am the administrator!
Could someone with a similar hardware configuration comment here on how
to achieve a properly working cluster?
Sincerely yours
Konrad Gruszka