[Pw_forum] Parallel execution and configuration issues

Konrad Gruszka kgruszka at wip.pcz.pl
Mon Sep 19 12:01:03 CEST 2016


Dear colleagues,

I'm writing to our community to clarify, once and for all, some issues 
with parallel pwscf execution.

From reading the PWSCF manual, I know that there are several levels of 
parallelization, including FFT parallelization and more. At the same 
time, I'm quite sure that most beginner QE users do not have complicated 
clusters, but rather start with decent multicore PCs.

There are two main parallel mechanisms (known to me), namely MPI and 
OpenMP. Because I'm using Ubuntu, there is the free Hydra MPI launcher 
(mpich) and also free OpenMP support. From what I know (please correct 
me if this is wrong), MPI is used for parallelization within one 
multicore PC, and OpenMP is used for parallelization across several 
PCs.
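
For concreteness, the kind of hybrid launch the manual describes for a 
single machine is, as far as I can tell, something like the sketch below 
(OMP_NUM_THREADS is the standard OpenMP environment variable; the counts 
are just an example for a 12-core box):

  # 6 MPI processes x 2 OpenMP threads each = 12 cores on one PC
  export OMP_NUM_THREADS=2
  mpirun -np 6 pw.x -in job.in > job.out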

So my reasoning is: because I have 2 multicore PCs (PC1: 6 cores with 2 
threads each, and PC2: 4 cores with 2 threads each, giving 12 logical 
'cores' on PC1 and 8 on PC2), I've installed Hydra MPI on both PC1 and 
PC2 (for parallelization within each multicore machine), and both have 
OpenMP for communication between PC1 and PC2, so that they act as a 
'cluster'.
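
As far as I understand, the Hydra launcher expects the machines to be 
listed in a host file, something like the sketch below (the hostnames 
pc1 and pc2 are placeholders for the real ones):

  # hosts.txt -- one line per machine, with its number of cores
  pc1:12
  pc2:8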

I've configured OpenMP so that both machines see one another, and when I 
start a pw.x job using OpenMP on both machines, I can see that the 
processors are 100% busy. The problem is that the resulting output file 
contains several repeats of the pwscf output (just as if several 
separate pwscf processes were writing to one file at the same time).

QE was configured and compiled with --enable-openmp on both machines, 
and when I simply run pw.x on PC1 it reports:

   "Parallel version (MPI & OpenMP), running on      12 processor cores"

I'm running the job with this command:

  mpirun.openmp -np 20 pw.x -in job.in > job.out

and, as said before, both machines are kept busy by this job (verified 
using htop).
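
For comparison, my understanding is that an explicit launch spread over 
both machines with Hydra would look more like this (hosts.txt is the 
hypothetical host file sketched above; one OpenMP thread per process to 
avoid oversubscribing the cores):

  export OMP_NUM_THREADS=1
  mpirun -f hosts.txt -np 20 pw.x -in job.in > job.out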

Looking for help on this topic, I often end up at the advice 'please 
contact your system administrator', but sadly I am the administrator!

Could someone with a similar hardware configuration comment here on how 
to achieve a properly working cluster?

Sincerely yours,

Konrad Gruszka
