[Pw_forum] Maximizing amount of memory available under Linux/x86
Nicola Marzari
marzari at MIT.EDU
Thu May 13 06:24:53 CEST 2004
Dear Serguei,
many thanks for your comments! We'll experiment with this.
One poor man's solution that we have found is simply to run
more MPI processes than there are CPUs.
Let me explain - our first test involves running on 4 CPUs, each of
them with 4 GB of physical ram, and suffering from the Linux limit
of 2 GB per process. For our test system (a large molecule) we
can run on those 4 CPUs with mpirun -np 4 only if we keep the
cutoff down to 15 Ry; a larger cutoff means that the processes
hit the 2 GB allocation ceiling. But if we run on those same 4
CPUs with mpirun -np 8, we can go up to a 20 Ry cutoff, and
with mpirun -np 12 we can go up to 25 Ry.
Performance doesn't seem to degrade much - here are some other
tests on a bulk system with ~100 atoms, run on one single CPU
with 2 GB of physical RAM (DDR3200), using mpirun -np X with
X = 1, 2, 3, 4. In the ideal case, all jobs should take the same
time. Wall clock times measured on an otherwise empty machine are:
np   size per process   time (5 iter)   time (10 iter)   (10-5)
 1   1721 MB            3084s           5191s            2107s
 2   1007 MB            3251s           5580s            2329s
 2   1007 MB            3347s           5591s            2247s
 3    782 MB            4020s           6463s            2443s
 3    782 MB            3556s           5942s            2386s
 3    782 MB            3812s           6159s            2347s
 4    649 MB            3-6000s         9-17000s         5-10000s
Using mpirun -np 4 the computer starts swapping a lot, and
performance greatly decreases. Otherwise, it looks OK; going
from np=1 to np=2 or np=3, the wall time for the last 5
iterations (the 10-5 column) increases only by about 10-15%.
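
For anyone who wants to redo the arithmetic, here is a rough Python
sketch that recomputes the overhead from the (10-5) column and adds up
the per-process sizes. The function and variable names are made up for
this illustration, the np=4 row is left out because only ranges were
reported, and the summed size is only indicative, since the table does
not say whether "size per process" is virtual or resident memory.

```python
# (np, MB per process, reported "(10-5)" time in seconds), copied from the table.
# The np=4 row is omitted because its timings were only given as ranges.
RUNS = [
    (1, 1721, 2107),
    (2, 1007, 2329),
    (2, 1007, 2247),
    (3, 782, 2443),
    (3, 782, 2386),
    (3, 782, 2347),
]

BASELINE_S = RUNS[0][2]  # the np=1 run is the reference


def overhead_percent(seconds):
    """Extra wall time relative to the np=1 run, in percent."""
    return 100.0 * (seconds - BASELINE_S) / BASELINE_S


def aggregate_mb(nproc, mb_per_process):
    """Sum of the reported per-process sizes for one job."""
    return nproc * mb_per_process


for nproc, mb, seconds in RUNS:
    print(f"np={nproc}: aggregate {aggregate_mb(nproc, mb)} MB, "
          f"(10-5) = {seconds} s, overhead {overhead_percent(seconds):+.1f}%")
```

The same aggregate_mb helper applied to the np=4 row (4 x 649 MB)
gives the summed size for the run that started swapping.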
(BTW, all these tests were done with the Democritos CP code, not
PWSCF).
Best,
nicola
Serguei Patchkovskii wrote:
> Hi folks,
>
> Since the last time the question on maximizing the amount of
> RAM available for dynamic allocation on Linux/x86 systems came
> up, I had a chance to revisit the issue. As it turns out,
> recent Linux kernels provide a very convenient way of fiddling
> with some of the kernel parameters, which required a kernel
> rebuild before. As of 2.4.21 kernels, TASK_UNMAPPED_BASE can
> be adjusted on a per-process basis. So, now programs which
> need to maximize the amount of ALLOCATE'able memory can get
> it, even with stock kernels.
---------------------------------------------------------------------
Prof Nicola Marzari Department of Materials Science and Engineering
13-5066 MIT 77 Massachusetts Avenue Cambridge MA 02139-4307 USA
tel 617.4522758 fax 617.2586534 marzari at mit.edu http://nnn.mit.edu