[Q-e-developers] Questions About Optimization and Parallelism of Quantum Espresso
Ye Luo
xw111luoye at gmail.com
Thu Oct 20 02:44:53 CEST 2016
Dear Carlo,
It was very enlightening to read your comments on architectures and QE.
Do you have any more recent technical talks about the refactoring of QE?
My knowledge is still limited to your talks from 2012, when BG/Q was
introduced, and the QE developer meeting slides from 2014/2015.
Having more advanced libraries definitely helps performance, but
changes at the upper level of the code can probably be more beneficial.
I'm curious about more recent advances in QE on changing the internal data
layout (mainly the wavefunction distribution) for better parallelization.
Have task groups been generalized beyond the FFT? Perhaps band groups and
task groups could be merged, if possible...
Although hybrid functionals or GW can be implemented with better parallel
efficiency, the plain DFT part can be a severe bottleneck in the
computation.
Best,
Ye
===================
Ye Luo, Ph.D.
Leadership Computing Facility
Argonne National Laboratory
2016-10-19 2:53 GMT-05:00 Carlo Cavazzoni <c.cavazzoni at cineca.it>:
> Dear Sergio,
>
> we are obviously addressing all those issues on different architectures
> with different vendors,
> and here comes the point: architectures are not converging!
> As you know, there are two main basic designs: Homogeneous and
> Heterogeneous (a.k.a. accelerated),
> with some vendors, like Intel, oscillating between the two (KNC
> Heterogeneous, KNL Homogeneous,
> and the recently announced Stratix X FPGA Heterogeneous again).
> It is not that easy to have a code that copes with all of them
> effectively, especially because
> some of the best tools for new architectures are not standard (CUDA).
> This is a real pity,
> and it makes me complain to Nvidia all the time for not supporting a
> standard paradigm
> (yes I know, there is OpenACC, the new OpenMP features, OpenCL ... but CUDA
> remains far more effective).
> This is a sort of disruption for a community of developers like ours.
>
> Nevertheless, to reduce this complexity we recently encapsulated the two
> main computational kernels
> (parallel FFT and Linear Algebra) into self-contained libraries (FFTXlib
> and LAXlib), each including a small
> app (please read the README files included in the two libraries) that
> allows one to experiment with and
> best tune all the parameters (for parallelization, vectorization, tasking,
> etc.).
>
> To play with the two libraries you need to know very little about the
> physics of QE,
> so they are ideal for people like you who need to look into optimization
> issues.
> In particular, any improvements in these two libraries are immediately
> transferred into the main QE
> codes (and others as well).
>
> If you want to know more about our next developments, we are working with
> non blocking MPI collectives and task based parallelism to try to overlap
> communications and computations within the FFT.
> The most recent (non-production) developments in the FFT library can be
> found at:
>
> https://github.com/fabioaffinito/FFTXlib
>
>
>
> Another interesting exercise could be to review LAXlib, closely following
> the advances in dense linear algebra algorithms promoted by Dongarra et al.:
> http://insidehpc.com/2016/10/jack-dongarra-presents-adaptive-linear-solvers-and-eigensolvers/
>
> In terms of programming paradigms, we are supporting open initiatives,
> especially in close collaboration with BSC and various standardization
> committees (like OpenMP),
> as well as the recently announced effort promoted by AMD to open-source
> software and drivers
> for heterogeneous architectures: https://radeonopencompute.github.io/
>
>
> best,
> carlo
>
>
>
> On 18/10/2016 16:14, Sérgio Caldas wrote:
>
> Hi!
>
> I'm Sérgio Caldas, an MSc student in Informatics Engineering at the
> University of Minho, Braga, Portugal. The key area of specialisation in my
> master's courses was parallel computing, with a strong focus on efficiency
> and performance engineering on heterogeneous systems. My master's thesis
> applies these skills to computational physics: I'm
> supposed to help a senior physics researcher tune his work on
> determining the electronic and optical properties of materials, using the
> Quantum Espresso tool on our departmental cluster. This cluster has nodes
> with several generations of dual multicore Xeons and some nodes with Xeon
> Phi (both KNC and KNL) and GPUs (both Fermi and Kepler, and soon Pascal).
>
> I have some questions about QE, namely how far QE development has
> progressed in these areas (vectorisation, data/task parallelism on both
> shared and distributed memory, data locality).
>
> For example:
> - Does QE already exploit vector operations (AVX/AVX2 or AVX-512)?
> - Is the tool ready for multicore / many-core devices?
> - How is work scheduled between multicore devices and accelerator
> devices, so that both types of device are used simultaneously?
> - For distributed memory, does the tool already take advantage of
> low-latency interconnects, such as Myrinet or InfiniBand?
> - How can I get access to beta versions where these advanced capabilities
> are being explored?
> - Do you have suggestions of areas that still need to be improved, so
> that I can address those areas and improve both the quality of my work and
> the overall QE performance?
>
> I would also be grateful if you could suggest documentation (preferably
> papers) to get some of these answers or any other documentation to
> complement my knowledge on QE.
>
> Thanking you in advance, yours sincerely
> Sergio Caldas
>
>
> _______________________________________________
> Q-e-developers mailing list
> Q-e-developers at qe-forge.org
> http://qe-forge.org/mailman/listinfo/q-e-developers
>
>
>
> --
> Ph.D. Carlo Cavazzoni
> SuperComputing Applications and Innovation Department
> CINECA - Via Magnanelli 6/3, 40033 Casalecchio di Reno (Bologna)
> Tel: +39 051 6171411 Fax: +39 051 6132198
> www.cineca.it
>
>