[Pw_forum] Maximizing amount of memory available under Linux/x86

Nicola Marzari marzari at MIT.EDU
Thu May 13 06:24:53 CEST 2004



Dear Serguei,


Many thanks for your comments! We'll experiment with this.

One poor man's solution that we have found is simply to run
more than one MPI process on each CPU.

Let me explain: our first test runs on 4 CPUs, each of them
with 4 GB of physical RAM and subject to the Linux limit of
2 GB per process. For our test system (a large molecule) we
can run on those 4 CPUs with mpirun -np 4 only if we keep the
cutoff down to 15 Ry; with a larger cutoff the processes
hit the 2 GB allocation ceiling. But if we run on those same 4
CPUs with mpirun -np 8, we can go up to a 20 Ry cutoff, and
with mpirun -np 12 we can go up to 25 Ry.
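
As a rough plausibility check of these numbers, here is a minimal
Python sketch. It rests on two assumptions that are not stated in
the post: that the memory-dominant arrays scale with the number of
plane waves, i.e. as cutoff**1.5, and that they are divided evenly
among the MPI processes. The function name is purely illustrative.

```python
# Assumptions (not from the post): distributed memory grows as
# E_cut**1.5 (plane-wave count) and is split evenly across processes.
# Replicated per-process data is ignored, so this is only a lower bound.

# (mpirun -np, cutoff in Ry) combinations reported in the post
RUNS = [(4, 15.0), (8, 20.0), (12, 25.0)]


def relative_memory_per_process(n_procs, ecut, ref_procs=4, ref_ecut=15.0):
    """Per-process memory relative to the reference run (np=4, 15 Ry)."""
    total_growth = (ecut / ref_ecut) ** 1.5   # growth of the total footprint
    return total_growth * ref_procs / n_procs  # share held by each process


if __name__ == "__main__":
    for n_procs, ecut in RUNS:
        ratio = relative_memory_per_process(n_procs, ecut)
        print(f"np={n_procs:2d}  cutoff={ecut:4.0f} Ry  "
              f"relative memory per process = {ratio:.2f}")
```

Under these assumptions the np=8 and np=12 runs need less memory
per process than the np=4 run at 15 Ry, which is at least
consistent with them staying below the 2 GB ceiling.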

Performance doesn't seem to degrade much. Here are some other
tests on a bulk system with ~100 atoms, run on a single CPU
with 2 GB of physical RAM (DDR3200), using mpirun -np X
with X=1,2,3,4. In the ideal case, all jobs should take the same
time. Wall clock times measured on an otherwise empty machine are:

np   size per process   time (5 iter)   time (10 iter)   difference (10-5)

1    1721 MB            3084s           5191s            2107s
2    1007 MB            3251s           5580s            2329s
2    1007 MB            3347s           5591s            2247s
3     782 MB            4020s           6463s            2443s
3     782 MB            3556s           5942s            2386s
3     782 MB            3812s           6159s            2347s
4     649 MB            3-6000s         9-17000s         5-10000s

With mpirun -np 4 the computer starts swapping heavily, and
performance drops sharply. Otherwise, it looks OK: the
difference in wall time between the 10-iteration and 5-iteration
runs grows by only about 10-15% relative to the np=1 case.
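
The overhead quoted above can be recomputed from the table. The
sketch below is a minimal Python example that uses the reported
(10-5) column as given and averages the repeated runs for each np;
the variable and function names are illustrative. The np=4 row is
left out of the timing because only ranges were reported for it.

```python
from collections import defaultdict

# Size per process in MB, as reported in the table
SIZE_MB = {1: 1721, 2: 1007, 3: 782, 4: 649}

# (np, reported 10-iter minus 5-iter wall time in seconds); np=4 omitted
DIFF_S = [(1, 2107), (2, 2329), (2, 2247), (3, 2443), (3, 2386), (3, 2347)]


def mean_diff_by_np(rows):
    """Average the reported (10-5) time over the repeated runs of each np."""
    grouped = defaultdict(list)
    for n_procs, seconds in rows:
        grouped[n_procs].append(seconds)
    return {n: sum(v) / len(v) for n, v in grouped.items()}


def overhead_percent(rows):
    """Slowdown of each np relative to the np=1 run, in percent."""
    means = mean_diff_by_np(rows)
    base = means[1]
    return {n: 100.0 * (m / base - 1.0) for n, m in means.items()}


def aggregate_memory_mb(sizes):
    """Total footprint: np times the size per process."""
    return {n: n * mb for n, mb in sizes.items()}


if __name__ == "__main__":
    overhead = overhead_percent(DIFF_S)
    total = aggregate_memory_mb(SIZE_MB)
    for n_procs in sorted(SIZE_MB):
        pct = overhead.get(n_procs)
        pct_text = f"{pct:5.1f}%" if pct is not None else "  n/a"
        print(f"np={n_procs}  total={total[n_procs]:4d} MB  overhead={pct_text}")
```

The aggregate column also shows that the total footprint grows
with np, so part of the memory is evidently not divided among
the processes.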

(BTW, all these tests were done with the Democritos CP code, not
PWSCF).

Best,

			nicola


Serguei Patchkovskii wrote:

> Hi folks,
> 
> Since the last time the question on maximizing the amount of
> RAM available for dynamic allocation on Linux/x86 systems came
> up, I had a chance to revisit the issue. As it turns out,
> recent Linux kernels provide a very convenient way of fiddling
> with some of the kernel parameters, which required a kernel
> rebuild before. As of 2.4.21 kernels, TASK_UNMAPPED_BASE can
> be adjusted on a per-process basis. So, now programs which
> need to maximize the amount of ALLOCATE'able memory can get
> it, even with stock kernels.

---------------------------------------------------------------------
Prof Nicola Marzari   Department of Materials Science and Engineering
13-5066   MIT   77 Massachusetts Avenue   Cambridge MA 02139-4307 USA
tel 617.4522758  fax 617.2586534  marzari at mit.edu  http://nnn.mit.edu


