[Pw_forum] problems with running Quantum Espresso in parallel
torstein.fjermestad at unito.it
Tue Apr 22 12:02:37 CEST 2014
Dear Axel Kohlmeyer,
thank you for your fast reply.
Unfortunately, I am not really sure if I have understood all the
details of your suggestions.
The version of mpirun is 1.4.5, according to the output of
"mpirun -V":
mpirun (Open MPI) 1.4.5
I am configuring and compiling pw.x on the same (virtual) machine on
which Open MPI is installed. I do this in the following way:
./configure
make pw
I doubt that there are several versions of Open MPI installed on the
same machine, but in case there are, is the version of Open MPI written
somewhere in the output of "make pw"?
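Independently of the "make pw" output, a check along the following
lines might show whether mpirun and pw.x belong to the same MPI
installation. This is only a sketch: it assumes that pw.x is
dynamically linked and located in the current directory, and
report_tool and report_mpi_libs are helper names I made up for
illustration, not part of Open MPI or Quantum Espresso.

```shell
#!/bin/sh
# Sketch: show which MPI tools come first in PATH and which MPI
# libraries an executable is dynamically linked against.
# Assumption: pw.x is dynamically linked (ldd is of no use otherwise).

report_tool() {
    # Print the full path the shell would run for the command "$1".
    if found=$(command -v "$1" 2>/dev/null); then
        echo "$1 -> $found"
    else
        echo "$1 -> not found in PATH"
    fi
}

report_mpi_libs() {
    # Print the MPI-related shared libraries of the executable "$1".
    if [ -x "$1" ]; then
        ldd "$1" | grep -i mpi ||
            echo "$1: no MPI library among the shared libraries"
    else
        echo "$1: not an executable file"
    fi
}

report_tool mpirun      # the launcher used in the submit script
report_tool mpif90      # the compiler wrapper configure may have picked
report_mpi_libs ./pw.x  # the MPI library pw.x is actually linked to
```

If the library directory printed for pw.x and the directory of mpirun
point to different installations, that would be consistent with the
mismatch described in the reply quoted below.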
I understand that this problem is not a pure Quantum Espresso problem,
and I have therefore also described the problem to the mailing list of
StarCluster.
Thanks in advance for your help.
Yours sincerely,
Torstein Fjermestad
On 17.04.2014 13:03, Axel Kohlmeyer wrote:
> On Thu, Apr 17, 2014 at 6:41 AM, <torstein.fjermestad at unito.it> wrote:
>> Dear all,
>>
>> I recently installed QE on a virtual cluster configured with
>> StarCluster.
>> QE configures and compiles without errors. However, when I submit a
>> parallel calculation on 16 processors, the following is written near
>> the start of the output file:
>>
>> Parallel version (MPI), running on 1 processors
>
>> The line is repeated 16 times in the output. To me it seems like I am
>> actually running 16 single processor calculations that all write to
>> the same output file (stdout in this case).
>
> this indicates that you are using an mpirun command that "belongs" to
> a different MPI library than the one the pw.x you are using was
> compiled with. please have a close look at the version output, to see
> if it is really the pw.x you expect to be using. also re-check the
> cluster documentation to make sure that you are using the mpirun that
> matches your MPI library.
>
>>
>> The way I submit this calculation, is the following:
>>
>> I write the following submit script (submit.sh):
>> cp /path/to/executable/pw.x .
>> mpirun -np 16 pw.x -in input.inp
>>
>
> here is a problem: unlike on windows, the current directory is not
> part of the search path, so you would have to use './pw.x' instead of
> 'pw.x' to use the pw.x executable in your current working directory
> (unless you have changed your profile to have '.' included in your
> $PATH variable, which is a very, very bad idea).
>
>> Then I submit the job with the following command:
>>
>> qsub -cwd -pe orte 16 ./submit.sh
>>
>> The queuing system of StarCluster is Open Grid Scheduler.
>>
>> For the line in the submit script, I have also tried several
>> alternatives such as:
>>
>> mpirun pw.x -in input.inp
>> mpirun pw.x -inp input.inp
>> mpirun -np 16 pw.x -inp input.inp
>> mpirun -np 16 pw.x < input.inp
>>
>> In the archives of this mailing list I have seen some similar
>> problems, but in spite of this I was still not able to solve my
>> problem.
>>
>> I would very much appreciate it if someone could give me suggestions
>> on how to solve the problem.
>>
>> Thanks in advance.
>>
>> Yours sincerely,
>> Torstein Fjermestad
>> University of Turin,
>> Italy
>>