[Pw_forum] difference between nr1 and nrx1

Stefano de Gironcoli degironc at sissa.it
Thu May 1 17:23:12 CEST 2008


Dear Timo,

    nrxx is the actual dimension of the FFT array: in a scalar run it is
nrx1*nrx2*nrx3, and in a parallel run it is nrx1*nrx2*dfftp%npp(me_pool+1),
where npp(me_pool+1) is the number of FFT x-y planes belonging to
processor me_pool in a given pool.
    When nrx1/=nr1 there are extra positions in the array. They usually
contain zeros, although it is safer not to count on that.
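
As a rough illustration of these sizes, a tiny standalone program can
print them. The grid dimensions and the split of x-y planes over two
processors are made up for the example; they are assumptions, not values
from any real run.

```fortran
program nrxx_sizes
  implicit none
  ! Hypothetical grid: physical size 4x4x4, first dimension padded to nrx1 = 5
  integer, parameter :: nr1 = 4, nr2 = 4, nr3 = 4
  integer, parameter :: nrx1 = 5, nrx2 = 4, nrx3 = 4
  ! Hypothetical split of the 4 x-y planes over 2 processors of one pool
  integer, parameter :: npp(2) = (/ 2, 2 /)
  integer :: me_pool

  print *, 'physical grid points :', nr1*nr2*nr3
  print *, 'nrxx in a scalar run :', nrx1*nrx2*nrx3
  do me_pool = 0, size(npp) - 1
     ! each processor holds only its own planes, padding included
     print *, 'nrxx on processor', me_pool, ':', nrx1*nrx2*npp(me_pool+1)
  end do
end program nrxx_sizes
```

With these numbers the padded array has 80 elements in a scalar run
against 64 physical grid points, so nrxx and nr1*nr2*nr3 differ as soon
as nrx1/=nr1.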

    To do the operation you describe we would typically do something
like
===
#if defined (__PARA)
      ! offset of this processor's slice: the planes held by lower ranks
      idx0 = nrx1*nrx2 * SUM ( dfftp%npp(1:me_pool) )
#else
      idx0 = 0
#endif
      DO ifft = 1, nrxx
            ! ... define three indices (xfft,yfft,zfft), all starting from 0
            idx   = idx0 + ifft - 1
            zfft  = idx / (nrx1*nrx2)
            idx   = idx - (nrx1*nrx2)*zfft
            yfft  = idx / nrx1
            idx   = idx - nrx1*yfft
            xfft  = idx
            !
            ! ... do not include points outside the physical range
            !
            IF ( xfft >= nr1 .or. yfft >= nr2 .or. zfft >= nr3 ) CYCLE
            rho = rho_val(ifft,1) + rho_core(ifft)
            ! ... position in fractional (crystal) coordinates
            r_pos(ifft,1) = dble(xfft)/dble(nr1)
            r_pos(ifft,2) = dble(yfft)/dble(nr2)
            r_pos(ifft,3) = dble(zfft)/dble(nr3)
            !
      END DO
===
which also works in parallel (see, for instance, subroutine qpointlist
in PW/realus or PW/make_pointlist.f90).
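
To see that this index arithmetic visits every physical grid point
exactly once, here is a self-contained sketch. It is only an emulation:
the grid sizes and the npp array are hypothetical, and the processors of
a pool are looped over serially instead of running under MPI.

```fortran
program padded_index_demo
  implicit none
  integer, parameter :: nr1 = 4, nr2 = 3, nr3 = 4   ! physical grid (hypothetical)
  integer, parameter :: nrx1 = 5, nrx2 = 3          ! padded leading dimensions
  integer, parameter :: npp(2) = (/ 3, 1 /)         ! x-y planes per processor
  integer :: me_pool, idx0, nrxx, ifft, idx, xfft, yfft, zfft
  integer :: nvisited
  logical :: seen(0:nr1-1, 0:nr2-1, 0:nr3-1)

  seen = .false.
  nvisited = 0
  do me_pool = 0, size(npp) - 1            ! emulate each processor in turn
     idx0 = nrx1*nrx2 * sum(npp(1:me_pool)) ! planes held by lower ranks
     nrxx = nrx1*nrx2 * npp(me_pool+1)      ! local array size
     do ifft = 1, nrxx
        idx  = idx0 + ifft - 1
        zfft = idx / (nrx1*nrx2)
        idx  = idx - (nrx1*nrx2)*zfft
        yfft = idx / nrx1
        xfft = idx - nrx1*yfft
        ! skip the padding positions
        if (xfft >= nr1 .or. yfft >= nr2 .or. zfft >= nr3) cycle
        if (seen(xfft, yfft, zfft)) error stop 'grid point visited twice'
        seen(xfft, yfft, zfft) = .true.
        nvisited = nvisited + 1
     end do
  end do
  if (nvisited /= nr1*nr2*nr3 .or. .not. all(seen)) error stop 'grid point missed'
  print *, 'visited', nvisited, 'of', nr1*nr2*nr3, 'physical points'
end program padded_index_demo
```

The program stops with an error if a point is missed or counted twice,
so a clean run confirms that the CYCLE test removes only the padding.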

If you need random access to the whole density, as for instance in the
symmetrization of the charge density, this is problematic because the
density is distributed. The only simple strategy we found was to collect
all density slices in a local array (dimensioned nrx1*nrx2*nrx3) and
perform the symmetrization in scalar mode; you can see the
gather/symmetrize/scatter procedure in the PW/psymrho.f90 routine.
This is fine because the symmetrization is not computationally
intensive, so we do not lose too much by doing it in scalar, but in
other situations other strategies may be needed.
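
The gather/symmetrize/scatter pattern can be sketched as follows. This is
not the code of PW/psymrho.f90: the processors are emulated serially
(no MPI), the grid sizes and npp are hypothetical, and a simple inversion
rho(r) -> (rho(r)+rho(-r))/2 stands in for the real symmetry operations.

```fortran
program gather_symmetrize_scatter
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nr1 = 4, nr2 = 4, nr3 = 4      ! physical grid (hypothetical)
  integer, parameter :: nrx1 = 5, nrx2 = 4, nrx3 = 4   ! padded array dimensions
  integer, parameter :: nproc = 2, nppmax = 2
  integer, parameter :: npp(nproc) = (/ 2, 2 /)        ! x-y planes per processor
  real(dp) :: slice(nrx1*nrx2*nppmax, nproc)           ! emulated distributed slices
  real(dp) :: full(nrx1, nrx2, nrx3), sym(nrx1, nrx2, nrx3)
  integer  :: ip, i, j, k, first, n

  call random_number(slice)   ! arbitrary data, padding positions included

  ! gather: stack the slices plane by plane into one local array
  first = 0
  do ip = 1, nproc
     n = nrx1*nrx2*npp(ip)
     full(:, :, first+1:first+npp(ip)) = &
          reshape(slice(1:n, ip), (/ nrx1, nrx2, npp(ip) /))
     first = first + npp(ip)
  end do

  ! symmetrize in scalar mode: any grid point can now be addressed directly
  sym = full
  do k = 1, nr3
     do j = 1, nr2
        do i = 1, nr1
           sym(i, j, k) = 0.5_dp * ( full(i, j, k) + &
                full(inv(i, nr1), inv(j, nr2), inv(k, nr3)) )
        end do
     end do
  end do

  ! scatter: hand each processor its own planes back
  first = 0
  do ip = 1, nproc
     n = nrx1*nrx2*npp(ip)
     slice(1:n, ip) = reshape(sym(:, :, first+1:first+npp(ip)), (/ n /))
     first = first + npp(ip)
  end do

  ! check: the symmetrized density must satisfy rho(r) = rho(-r)
  do k = 1, nr3
     do j = 1, nr2
        do i = 1, nr1
           if (abs(sym(i, j, k) - sym(inv(i, nr1), inv(j, nr2), inv(k, nr3))) &
                > 1.0e-12_dp) error stop 'density is not inversion symmetric'
        end do
     end do
  end do
  print *, 'symmetrized', nr1*nr2*nr3, 'points on', nproc, 'emulated processors'

contains

  ! 1-based index of the point -x on a periodic grid of n points
  pure integer function inv(i, n)
    integer, intent(in) :: i, n
    inv = mod(n - (i - 1), n) + 1
  end function inv

end program gather_symmetrize_scatter
```

Only the physical points are symmetrized; the padding positions are
copied through unchanged, in line with not relying on their content.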

hope this helps,
   stefano

Timo Thonhauser wrote:
> Dear Stefano,
>
> Thanks - that helped a lot! However, I have one more question:
> I have to scan and work with the density in such a way that
> I want to know the real space position of each grid point.
> So what I do is:
>
>     ifft = 0
>
>     do zfft=0, nr3-1
>     do yfft=0, nr2-1
>     do xfft=0, nr1-1
>      !
>      ifft  = ifft + 1
>      rho   = rho_val(ifft,1) + rho_core(ifft)
>
>      ...
>
>      r_pos(ifft,1)   = dble(xfft)/dble(nr1)
>      r_pos(ifft,2)   = dble(yfft)/dble(nr2)
>      r_pos(ifft,3)   = dble(zfft)/dble(nr3)
>      !
>     end do
>     end do
>     end do
>
> Say, we are running the code on a Cray and nrx1 was different
> from nr1. The loop over xfft above would stay the same and there
> is no need to let it run over xfft=0 to nrx1-1, right?
>
> As I understand it, the nrx? are only constructs to make the
> fft easier on certain machines, but there are no extra real
> space density grid points associated with them. If we use only
> one processor then nr1*nr2*nr3 = nrxx and the charge density is
> defined as rho(1:nrxx). But if nr? is different from nrx? then
> nrx1*nrx2*nrx3 IS NOT nrxx, but yet the charge density still
> only has rho(1:nrxx) values, right?
>
> Thanks for all your help!
>
> Best, Timo
>
>
>
>
> pw_forum-request at pwscf.org wrote:
>   
>> Message: 1
>> Date: Thu, 01 May 2008 00:52:19 +0200
>> From: Stefano de Gironcoli <degironc at sissa.it>
>> Subject: Re: [Pw_forum] difference between nr1 and nrx1
>> To: PWSCF Forum <pw_forum at pwscf.org>
>> Message-ID: <4818F823.1020202 at sissa.it>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Dear Timo,
>>   due to memory access conflicts on certain architectures (Cray was 
>> one of these), FFT routines were significantly slower if any of 
>> nrx1,nrx2,nrx3 (nr1 in scalar and nr3 in parallel executions) was a 
>> multiple of 2, while obviously nr1,nr2,nr3 are best when they are powers 
>> of 2. It was therefore convenient to have the flexibility to set 
>> nrx1=nr1+1 (or nrx3=nr3+1) when nr1 (nr3) was even.
>>   This is no longer the case on most machines, but the distinction 
>> between nr? and nrx? remains and in some cases it is still useful.
>>   Best regards,
>>    Stefano de Gironcoli
>>
>> Timo Thonhauser wrote:
>>   
>>     
>>> Dear Developers,
>>>
>>> What is the difference between nr1, nr2, nr3 and nrx1, nrx2, nrx3
>>> in the code? For all practical purposes they always seem to be
>>> the same.
>>>
>>> In PW/pwcom.f90 we find:
>>>
>>>    nr1,           &! fft dimension along x
>>>    nr2,           &! fft dimension along y
>>>    nr3,           &! fft dimension along z
>>>    nrx1,          &! maximum fft dimension along x
>>>    nrx2,          &! maximum fft dimension along y
>>>    nrx3,          &! maximum fft dimension along z
>>>
>>> But when would nr1 differ from nrx1?
>>>
>>> Thanks a lot!
>>> Timo
>>>
>>>   
>>>     
>>>       
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Thu, 01 May 2008 00:59:40 +0200
>> From: Stefano de Gironcoli <degironc at sissa.it>
>> Subject: Re: [Pw_forum] difference between nr1 and nrx1
>> To: PWSCF Forum <pw_forum at pwscf.org>
>> Message-ID: <4818F9DC.4020901 at sissa.it>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> just to complete the information, this check is done in the 
>> good_fft_dimension function in Modules/fft_scalar.f90
>> stefano
>>
>
>   



