[Pw_forum] Parallel execution and configuration issues

Axel Kohlmeyer akohlmey at gmail.com
Mon Sep 19 20:17:15 CEST 2016


On Mon, Sep 19, 2016 at 6:01 AM, Konrad Gruszka <kgruszka at wip.pcz.pl> wrote:
> Dear colleagues,
>
> I'm writing to our community to clarify, once and for all, some issues with
> parallel pwscf execution.
>
> From reading the PWSCF manual, I know that there are several levels of
> parallelization, including FFTW and more. At the same time, I'm quite sure
> that most beginner QE users do not have complicated clusters, but are
> rather starting with decent multicore PCs.
>
> There are two main parallel mechanisms (known to me) namely:  MPI and
> OpenMP. Because I'm using Ubuntu, there is free Hydra MPI software (mpich)

as far as i know, ubuntu ships with *two* MPI packages, OpenMPI (not
OpenMP) and MPICH, and OpenMPI is the default MPI package.
OpenMP is not a specific package, but a compiler feature (and is included in GCC).
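
to see which of the two MPI packages a launcher name actually points to,
something like the following sketch may help. it only resolves symlinks;
the assumption (not verified for every ubuntu release) is that the resolved
file name ends in something like ".openmpi" or ".mpich". the function name
resolve_launcher is made up for this example.

```python
import os
import shutil

def resolve_launcher(name="mpirun"):
    """Return the real path behind `name` on PATH, or None if it is not installed."""
    path = shutil.which(name)
    return os.path.realpath(path) if path else None

# print where the usual launcher names end up after following symlinks
for launcher in ("mpirun", "mpiexec"):
    print(launcher, "->", resolve_launcher(launcher))
```

the launcher found this way has to belong to the same MPI package that
pw.x was compiled against.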

> and also free OpenMP. From what I know (please correct me if this is wrong)
> MPI software is used for parallelization within one multicore PC and OpenMP
> is used to make parallelization between several PC's.

that is incorrect. OpenMP is *restricted* to the cores of a single
machine, because it is based on multi-threading in shared memory.
MPI (i.e. either OpenMPI or MPICH, but not both at the same time) can
be used for intra-node or inter-node parallelization or a combination of both.
in many cases, using only MPI is the most effective parallelization
approach. only when you have many cores per node and communication
contention becomes a performance-limiting factor will you run faster
with MPI+OpenMP.
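
to make the MPI vs. MPI+OpenMP trade-off concrete, here is a small sketch
that lists the ways a fixed number of cores on one node can be split into
MPI processes times OpenMP threads. the helper name hybrid_layouts is
hypothetical, and the printed command lines only illustrate the idea
(OMP_NUM_THREADS sets the thread count per MPI process); which split is
fastest has to be measured.

```python
def hybrid_layouts(cores):
    """Return every (mpi_ranks, omp_threads) pair whose product equals `cores`."""
    return [(ranks, cores // ranks)
            for ranks in range(cores, 0, -1)
            if cores % ranks == 0]

# example: a single 6-core machine; the first entry is the MPI-only layout
for ranks, threads in hybrid_layouts(6):
    print(f"OMP_NUM_THREADS={threads} mpirun -np {ranks} pw.x -in job.in")
```

the first line printed (6 MPI processes with 1 thread each) is the
MPI-only case, which is the one to start with.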


> So my reasoning is: because I have 2 multicore PCs (namely PC1: 6 cores 2
> threads, and PC2: 4 cores 2 threads, giving in overall PC1: 12 and PC2 8
> 'cores') on both PC1 and PC2 I've installed hydra MPI (for parallelization
> within multicores) and on both there are OpenMP for communication between
> PC1 and PC2 to act as 'cluster'.

you seem to be confusing OpenMPI with OpenMP. for your system, an
MPI-only configuration (using the default MPI, i.e. OpenMPI) should
work well. no need to worry about the additional complications from
OpenMP until you have a larger, more potent machine.
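
a minimal sketch of what an MPI-only launch across the two machines could
look like with OpenMPI. the host names pc1 and pc2, the file name
hosts.txt, and the choice of one MPI process per physical core (6 + 4 = 10,
rather than 20 hardware threads) are assumptions for illustration. it also
assumes the same OpenMPI version and the same pw.x are available on both
machines and that they can reach each other via ssh without a password.

```python
def openmpi_hostfile(nodes):
    """Format {hostname: slots} as OpenMPI hostfile text ("host slots=N" per line)."""
    return "".join(f"{host} slots={slots}\n" for host, slots in nodes.items())

def mpirun_command(nodes, hostfile="hosts.txt", exe="pw.x", infile="job.in"):
    """Build an mpirun argument list that starts one process per slot."""
    total = sum(nodes.values())
    return ["mpirun", "-np", str(total), "--hostfile", hostfile, exe, "-in", infile]

# hypothetical host names; slots = physical cores of each machine
nodes = {"pc1": 6, "pc2": 4}
print(openmpi_hostfile(nodes), end="")
print(" ".join(mpirun_command(nodes)))
```

the first two printed lines are the contents for hosts.txt, the last one
is the command to run (with the output redirected to a file as before).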

> I've configured OpenMP so both machines see one another and when I'm
> starting some pw.x job using openmp on both machines I see that processors
> are 100% busy. The problem is that in the produced output files I can see several
> repeats of pwscf output (just like several separate pwscf processes writing
> into one file at the same time).

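that is an indication of an executable compiled without MPI support or
with support for an MPI library *different* from the one used when
launching the executable. in either case each launched process runs as
an independent copy of the whole calculation instead of sharing the work.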
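
a quick way to check an output file for this symptom is to count how often
the start-up banner appears. the sketch below is only a rough check: the
pattern is modeled on the banner line quoted in this thread and may need
adjusting for other QE versions, and the function names are made up for
the example.

```python
import re

# pattern modeled on the banner quoted in this thread; other QE versions may differ
BANNER = re.compile(r"Parallel version.*running on\s+(\d+)\s+processor")

def banner_sizes(output_text):
    """Return the processor count reported by each pw.x banner found in the text."""
    return [int(m.group(1)) for m in BANNER.finditer(output_text)]

def looks_duplicated(output_text):
    """True when more than one banner appears, i.e. several copies wrote the file."""
    return len(banner_sizes(output_text)) > 1

# stand-in for the contents of job.out from a broken run
sample = (
    "     Parallel version (MPI & OpenMP), running on      12 processor cores\n"
    "     Parallel version (MPI & OpenMP), running on      12 processor cores\n"
)
print(banner_sizes(sample), looks_duplicated(sample))
```

a correctly working parallel run should show exactly one banner per
output file.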

> The QE was configured and compiled with --enable-openmp on both machines and
> when I simply run pw.x on PC1 it writes:
>
>   "Parallel version (MPI & OpenMP), running on      12 processor cores"
>
> I'm running job with this command:
>
>  mpirun.openmp -np 20 pw.x -in job.in > job.out
>
> and as said before, both machines are occupied with this job (tested using
> htop).
>
> Looking for help in this topic I often get to the truth 'please contact your
> system administrator', but sadly I am the administrator!

then look for somebody with experience in linux clusters. you
definitely want somebody local to help you, who can look over your
shoulder while you are doing things and explain stuff. this is really
tricky to do over e-mail. these days it should not be difficult to
find somebody, as using linux in parallel is ubiquitous in most
computational science groups.

there is also a *gazillion* of material available online, including
talk slides from workshops and courses, online courses, and
self-study material. HPC university has an aggregator for such
material at:
http://hpcuniversity.org/trainingMaterials/

axel.

>
> Could someone with a similar hardware configuration comment here on how
> to achieve a properly working cluster?
>
> Sincerely yours
>
> Konrad Gruszka
>
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum



-- 
Dr. Axel Kohlmeyer  akohlmey at gmail.com  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste, Italy.


