[QE-users] Optimal pw command line for large systems and only Gamma point
Antonio Cammarata
cammaant at fel.cvut.cz
Fri May 10 12:18:23 CEST 2024
Thanks to both Paolo and Giuseppe for your answers.
I cannot reduce the size of the system, as I need to consider a large
nanocluster; also, as I mentioned, I will need to consider even larger
systems. None of the systems are periodic, so I will always have only
the Gamma point.
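(For reference, a minimal sketch of the k-point card for this case,
assuming the standard Gamma-only syntax, which as far as I understand
selects the real-wavefunction code path and roughly halves the
wavefunction memory with respect to a generic single k-point:

K_POINTS gamma
)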
I submitted the calculation on 12 nodes with 12 cores per node and this
time it did not stop with the usual memory error, so I believe this is
the right route to follow. Also, for the moment I used no
parallelization options on the command line, as suggested by Giuseppe.
The question now is how to find the optimal number of cores and the
right command-line options to speed up the calculation without running
into memory issues.
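As a concrete starting point to benchmark (only a sketch: 144 processes
is just 12 nodes x 12 cores as in the run above, -nd is kept to a few
tens following Paolo's advice, and -nk 1 since there is only the Gamma
point), something like:

mpirun -np 144 pw.x -nk 1 -nt 1 -nb 1 -nd 64 -inp qe.in > qe.out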
Thanks a lot in advance for your suggestions on this.
All the best
Antonio
On 10. 05. 24 at 12:01, Paolo Giannozzi wrote:
> On 5/10/24 08:58, Antonio Cammarata via users wrote:
>
>> pw.x -nk 1 -nt 1 -nb 1 -nd 768 -inp qe.in > qe.out
>
> too many processors for linear-algebra parallelization. 1000 Si atoms
> = 2000 bands (assuming an insulator with no spin polarization). Use a
> few tens of processors at most
>
>> "some processors have no G-vectors for symmetrization".
>
> which sounds strange to me: with the Gamma point symmetrization is not
> even needed
>
>
>> Dense grid: 30754065 G-vectors FFT dimensions: ( 400, 400, 400)
>
> This is what a 256-atom Si supercell with 30 Ry cutoff yields:
>
> Dense grid: 825897 G-vectors FFT dimensions: ( 162, 162, 162)
>
> I guess you may reduce the size of your supercell
>
> Paolo
>
>> Dynamical RAM for wfc: 153.50 MB
>> Dynamical RAM for wfc (w. buffer): 153.50 MB
>> Dynamical RAM for str. fact: 0.61 MB
>> Dynamical RAM for local pot: 0.00 MB
>> Dynamical RAM for nlocal pot: 1374.66 MB
>> Dynamical RAM for qrad: 0.87 MB
>> Dynamical RAM for rho,v,vnew: 5.50 MB
>> Dynamical RAM for rhoin: 1.83 MB
>> Dynamical RAM for rho*nmix: 9.78 MB
>> Dynamical RAM for G-vectors: 2.60 MB
>> Dynamical RAM for h,s,v(r/c): 0.25 MB
>> Dynamical RAM for <psi|beta>: 552.06 MB
>> Dynamical RAM for wfcinit/wfcrot: 977.20 MB
>> Estimated static dynamical RAM per process > 1.51 GB
>> Estimated max dynamical RAM per process > 2.47 GB
>> Estimated total dynamical RAM > 1900.41 GB
>>
>> I managed to run the simulation with 512 atoms, cg diagonalization
>> and 3 nodes on the same machine with command line
>>
>> pw.x -nk 1 -nt 1 -nd 484 -inp qe.in > qe.out
>>
>> Please, do you have any suggestion on how to set optimal
>> parallelization parameters to avoid the memory issue and run the
>> calculation? I am also planning to run simulations on nanoclusters
>> with more than 1000 atoms.
>>
>> Thanks a lot in advance for your kind help.
>>
>> Antonio
>>
>>
>
--
_______________________________________________
Antonio Cammarata, PhD in Physics
Associate Professor in Applied Physics
Advanced Materials Group
Department of Control Engineering - KN:G-204
Faculty of Electrical Engineering
Czech Technical University in Prague
Karlovo Náměstí, 13
121 35, Prague 2, Czech Republic
Phone: +420 224 35 5711
Fax: +420 224 91 8646
ORCID: orcid.org/0000-0002-5691-0682
ResearcherID: A-4883-2014