<p dir="ltr">Hello Nicola</p>

<p dir="ltr">I am very interested in that AiiDA workflow.</p>

<p dir="ltr">Thanks very much for the offer to share it.</p>

<p dir="ltr">Kirk<br>

</p>

<div class="gmail_extra"><br><div class="gmail_quote">On Sep 20, 2016 4:05 AM, "nicola varini" <<a href="mailto:nicola.varini@epfl.ch">nicola.varini@epfl.ch</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear Konrad, as Axel suggested, the MPI-only configuration might be the<br>

best in your case.<br>

However, if you would like to run on bigger machines or newer hardware<br>

such as KNL processors, it is important to understand how OpenMP performs as well.<br>

To this end, using the AiiDA software (<a href="http://www.aiida.net/" rel="noreferrer" target="_blank">http://www.aiida.net/</a><wbr>), I<br>

created a workflow that runs a series<br>

of benchmarks and compares the performance of different MPI+OpenMP<br>

combinations by splitting<br>

the communicators following a certain logic.<br>

At the moment this is not present in the AiiDA release; however, if there<br>

is any interest I am more than<br>

happy to share.<br>
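As a rough illustration of what such a benchmark series covers (this is not the AiiDA workflow itself, and it does not split communicators), the sketch below lists every MPI ranks x OpenMP threads split that fills a given number of cores. The function names are hypothetical, and the executable and input names are simply taken from the command quoted further down.<br>

```python
def mpi_openmp_combinations(total_cores):
    """Return every (mpi_ranks, omp_threads) pair whose product is total_cores."""
    if total_cores < 1:
        raise ValueError("total_cores must be a positive integer")
    combos = []
    for ranks in range(1, total_cores + 1):
        # Keep only splits that use all cores exactly once.
        if total_cores % ranks == 0:
            combos.append((ranks, total_cores // ranks))
    return combos


def launch_line(ranks, threads, executable="pw.x", input_file="job.in"):
    """Build (but do not run) the shell command for one benchmark point."""
    # OMP_NUM_THREADS is the standard OpenMP variable for the thread count.
    return f"OMP_NUM_THREADS={threads} mpirun -np {ranks} {executable} -in {input_file}"


if __name__ == "__main__":
    # Example: a 12-core node, from pure OpenMP (1 rank) to pure MPI (12 ranks).
    for ranks, threads in mpi_openmp_combinations(12):
        print(launch_line(ranks, threads))
```

Timing each of these runs on the same input would show where, if anywhere, adding OpenMP threads starts to pay off on a given machine.<br>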

<br>

Nicola<br>

<br>

<br>

On 20.09.2016 00:25, Konrad Gruszka wrote:<br>

> Dear Stefano and Axel,<br>

><br>

> Thank you very much for your answers. In fact, the 'silly' mistake was<br>

> in confusing OpenMP with OpenMPI! (just like Axel suggested) - now it<br>

> seems so clear.<br>

> I'll now stick with OpenMPI or Hydra only (both did work fine within one<br>

> machine) and just try to configure it properly so it can 'see' both hosts.<br>

><br>

> Sometimes just a few words from experts can do real magic.<br>

><br>

> Konrad.<br>

><br>

> On 2016-09-19 at 20:17, Axel Kohlmeyer wrote:<br>

>> On Mon, Sep 19, 2016 at 6:01 AM, Konrad Gruszka <<a href="mailto:kgruszka@wip.pcz.pl">kgruszka@wip.pcz.pl</a>> wrote:<br>

>>> Dear colleagues,<br>

>>><br>

>>> I'm writing to our community to clarify once and for all issues with<br>

>>> parallel pwscf execution.<br>

>>><br>

>>> From reading the PWSCF manual, I know that there are several levels of<br>

>>> parallelization, including FFTW and more. At the same time, I'm quite sure<br>

>>> that most beginner QE users do not have complicated clusters, but are<br>

>>> rather starting with decent multicore PCs.<br>

>>><br>

>>> There are two main parallel mechanisms (known to me) namely:  MPI and<br>

>>> OpenMP. Because I'm using Ubuntu, there is free Hydra MPI software (mpich)<br>

>> as far as i know, ubuntu ships with *two* MPI packages, OpenMPI (not<br>

>> OpenMP) and MPICH, and OpenMPI is the default MPI package.<br>

>> OpenMP is not a specific package, but a compiler feature (and included in GCC).<br>

>><br>

>>> and also free OpenMP. From what I know (please correct me if this is wrong)<br>

>>> MPI software is used for parallelization within one multicore PC and OpenMP<br>

>>> is used to make parallelization between several PC's.<br>

>> that is incorrect. OpenMP is *restricted* to the cores of a single machine, because<br>

>> it is based on multi-threading.<br>

>> MPI (i.e. either OpenMPI or MPICH, but not both at the same time) can<br>

>> be used for intra-node or inter-node or a combination of both.<br>

>> in many cases, using only MPI is the most effective parallelization<br>

>> approach. only when you have many cores per node and communication<br>

>> contention becomes a performance limiting factor, you will run faster<br>

>> with MPI+OpenMP<br>

>><br>

>><br>

>>> So my reasoning is: because I have 2 multicore PCs (namely PC1: 6 cores with 2<br>

>>> threads each, and PC2: 4 cores with 2 threads each, giving overall PC1: 12 and PC2: 8<br>

>>> 'cores') on both PC1 and PC2 I've installed hydra MPI (for parallelization<br>

>>> within multicores) and on both there is OpenMP for communication between<br>

>>> PC1 and PC2 to act as a 'cluster'.<br>

>> you seem to be confusing OpenMPI with OpenMP. for your system, an<br>

>> MPI-only configuration (using the default MPI, i.e. OpenMPI) should<br>

>> work well. no need to worry about the additional complications from<br>

>> OpenMP until you have a larger, more potent machine.<br>
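A minimal sketch of what an MPI-only launch across the two machines could look like with Open MPI, written as a small Python helper that only builds the hostfile text and the command. The host names pc1 and pc2, the file name hosts.txt, and the choice of one slot per physical core (6 and 4) are assumptions for illustration; the "hostname slots=N" lines and the --hostfile option are Open MPI's, and MPICH's Hydra launcher uses a different syntax.<br>

```python
def openmpi_hostfile(slots_by_host):
    """Render Open MPI hostfile text: one 'hostname slots=N' line per machine."""
    lines = []
    for host, slots in slots_by_host.items():
        if slots < 1:
            raise ValueError(f"{host}: slots must be at least 1")
        lines.append(f"{host} slots={slots}")
    return "\n".join(lines) + "\n"


def mpirun_command(slots_by_host, hostfile="hosts.txt", input_file="job.in"):
    """Build the argument list for an MPI-only pw.x run using every slot."""
    total_ranks = sum(slots_by_host.values())
    return ["mpirun", "--hostfile", hostfile, "-np", str(total_ranks),
            "pw.x", "-in", input_file]


if __name__ == "__main__":
    # Hypothetical host names; slots set to physical cores, not hyperthreads.
    hosts = {"pc1": 6, "pc2": 4}
    print(openmpi_hostfile(hosts), end="")
    print(" ".join(mpirun_command(hosts)))
```

The same MPI installation (and the same pw.x build) would need to be available on both hosts for such a launch to work.<br>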

>><br>

>>> I've configured OpenMP so both machines see one another and when I'm<br>

>>> starting some pw.x job using openmp on both machines i see that processors<br>

>>> are 100% busy. Problem is that in produced output files I can see several<br>

>>> repeats of pwscf output (just like several separate pwscf processes write<br>

>>> into one file at the same time).<br>

>> that is an indication of compiling the executable without MPI support or<br>

>> with support for an MPI library *different* from the one used when<br>

>> launching the executable.<br>
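One hedged way to check for this symptom is to count how many times the start-up banner appears in the output file. The sketch below assumes that each pw.x run prints one banner line containing "Program PWSCF"; that marker string and the function names are assumptions, so adjust the marker to whatever your version actually prints.<br>

```python
def count_pwscf_banners(output_text, marker="Program PWSCF"):
    """Count the lines that contain the (assumed) pw.x start-up banner."""
    return sum(1 for line in output_text.splitlines() if marker in line)


def looks_like_independent_copies(output_text):
    """True when more than one banner is found, i.e. several runs wrote here."""
    return count_pwscf_banners(output_text) > 1


if __name__ == "__main__":
    # Made-up sample standing in for a job.out written by two serial copies.
    sample = (
        "     Program PWSCF starts on ...\n"
        "     Program PWSCF starts on ...\n"
        "     some later output line\n"
    )
    print(count_pwscf_banners(sample), looks_like_independent_copies(sample))
```

A properly launched parallel job should report a single banner, however many MPI processes were started.<br>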

>><br>

>>> The QE was configured and compiled with --enable-openmp on both machines and<br>

>>> when I simply run pw.x on PC1 it writes:<br>

>>><br>

>>>     "Parallel version (MPI & OpenMP), running on      12 processor cores"<br>

>>><br>

>>> I'm running job with this command:<br>

>>><br>

>>>    mpirun.openmp -np 20 pw.x -in <a href="http://job.in" rel="noreferrer" target="_blank">job.in</a> > job.out<br>

>>><br>

>>> and as said before, both machines are occupied with this job (tested using<br>

>>> htop).<br>

>>><br>

>>> Looking for help on this topic I often get the answer 'please contact your<br>

>>> system administrator', but sadly I am the administrator!<br>

>> then look for somebody with experience in linux clusters. you<br>

>> definitely want somebody local to help you, that can look over your<br>

>> shoulder while you are doing things and explain stuff. this is really<br>

>> tricky to do over e-mail. these days it should not be difficult to<br>

>> find somebody, as using linux in parallel is ubiquitous in most<br>

>> computational science groups.<br>

>><br>

>> there also is a *gazillion* of material available online including<br>

>> talk slides from workshops and courses and online courses and<br>

>> self-study material. HPC university has an aggregator for such<br>

>> material at:<br>

>> <a href="http://hpcuniversity.org/trainingMaterials/" rel="noreferrer" target="_blank">http://hpcuniversity.org/<wbr>trainingMaterials/</a><br>

>><br>

>> axel.<br>

>><br>

>>> Could someone with a similar hardware configuration comment here on how<br>

>>> to achieve a properly working cluster?<br>

>>><br>

>>> Sincerely yours<br>

>>><br>

>>> Konrad Gruszka<br>

>>><br>

>>><br>

>>> ______________________________<wbr>_________________<br>

>>> Pw_forum mailing list<br>

>>> <a href="mailto:Pw_forum@pwscf.org">Pw_forum@pwscf.org</a><br>

>>> <a href="http://pwscf.org/mailman/listinfo/pw_forum" rel="noreferrer" target="_blank">http://pwscf.org/mailman/<wbr>listinfo/pw_forum</a><br>

>><br>

<br>

--<br>

Nicola Varini, PhD<br>

<br>

Scientific IT and Application Support (SCITAS)<br>

Theory and simulation of materials (THEOS)<br>

CE 0 813 (Bâtiment CE)<br>

Station 1<br>

CH-1015 Lausanne<br>

<a href="http://scitas.epfl.ch" rel="noreferrer" target="_blank">http://scitas.epfl.ch</a><br>

<br>

Nicola Varini<br>

<br>

</blockquote></div></div>