[Pw_forum] Problem with QE 4.2.1 and AMD Opteron 6200 / 6300

Thu Dec 12 11:20:15 CET 2013

Dear Fabricio,

I suspect there is some inconsistency in the results you are obtaining.
The AMD 6380 is a 16-core model, so I wonder how you map process 
affinity when running on 8 cores.
Without controlling that mapping, performance can vary substantially 
from one run to the next.
Cache memory and FPUs are shared among given sets of cores, and 
increasing contention for those shared resources degrades 
performance.
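
To illustrate the affinity point, here is a minimal Python sketch (Linux only) that picks one core per FPU-sharing pair and pins a process to it. The function names are hypothetical, and the core numbering is an assumption: it presumes the OS numbers the two cores that share an FPU consecutively (0-1, 2-3, ...), which should be checked with lscpu or hwloc's lstopo before relying on it.

```python
import os

def one_core_per_module(n_ranks, cores_per_module=2, total_cores=16):
    """Return core IDs that place each rank on a different module.

    ASSUMPTION: cores sharing an FPU are numbered consecutively
    (0-1, 2-3, ...). Verify the real topology before using this.
    """
    # First core of every module: 0, 2, 4, ...
    candidates = list(range(0, total_cores, cores_per_module))
    if n_ranks > len(candidates):
        raise ValueError("more ranks than modules; some must share an FPU")
    return candidates[:n_ranks]

def pin_current_process(core_id):
    """Pin the calling process to a single core (Linux only)."""
    os.sched_setaffinity(0, {core_id})

if __name__ == "__main__":
    # 8 ranks on a 16-core CPU: one rank per module under the assumption above
    print(one_core_per_module(8))  # [0, 2, 4, 6, 8, 10, 12, 14]
```

In practice the same mapping is usually applied through the MPI launcher's own binding options, or with tools such as taskset or numactl, rather than from inside the application.
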
The AMD 6380 also supports the AVX instruction set extension for 
256-bit vector operations. Does your O.S. support that too?
Compile a simple source with -mavx and see whether you can run it, or 
check whether the "avx" flag is present in your /proc/cpuinfo.
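
The /proc/cpuinfo check can be scripted. Below is a small Python sketch; has_cpu_flag is a hypothetical helper name, and it assumes the usual x86 Linux layout in which the feature list sits on lines whose key is "flags". It matches whole tokens, so "avx2" alone does not count as "avx".

```python
def has_cpu_flag(cpuinfo_text, flag="avx"):
    """Return True if `flag` is listed on any 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare whole whitespace-separated tokens, not substrings
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("avx reported:", has_cpu_flag(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

This only reports what the kernel exposes; actually running a binary built with -mavx, as suggested above, remains the more direct test.
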

In my experience benchmarking QE on the same CPU system, the 
combination of the Intel compiler + MKL always turned out to be the 
best option.

Regards,

Ivan

On 11/12/2013 19:52, Fabricio Cannini wrote:
> On 11-12-2013 15:51, Paolo Giannozzi wrote:
>> On Tue, 2013-12-10 at 19:49 -0200, Fabricio Cannini wrote:
>>> On 10-12-2013 18:34, Paolo Giannozzi wrote:
>>>> First of all you should verify if multi-threading libraries
>>>> are conflicting with MPI parallelization.
>>>
>>> Yes, I did look into it already.
> Hi there
>
>
> So, what else can I look into?
> I did more tests, on the same Opteron 6380 machine, using the same
> binaries, but now using the "DEISA medium benchmark" and the results
> were interesting.
>
> http://qe-forge.org/gf/project/q-e/frs/?action=FrsReleaseView&release_id=45
>
>
> ifort 13.2 + mkl 11.0         / 8 cores = 1h8m
> gfortran 4.6 + openblas 0.2.8 / 8 cores = 46m57.62s
>
>
>
> This makes me even more suspicious that the Intel compiler is the problem.
>
> TIA,
> Fabricio