[Q-e-developers] Segfaults with reused FFTW plans due to unaligned memory on i386

Michael Banck mbanck at gmx.net
Tue Dec 10 21:47:08 CET 2013


We saw lots of segfaults in the testsuite for the Debian build on i386
(32-bit Intel architecture), for example:


|Checking atom-lsda...passed
|Program received signal SIGSEGV: Segmentation fault - invalid memory
|Backtrace for this error:
|#0  0x565AC033
|#1  0x565AC6C0
|#2  0x555763FF
|#3  0x562B53D3
|#4  0x561EDA0B
|#5  0x561F0602
|#6  0x561F0B2A
|#7  0x562903E5
|#8  0x8207995 in __fft_scalar_MOD_cft_2xy at fft_scalar.f90:799
|#9  0x82038B2 in __fft_parallel_MOD_tg_cft3s at fft_parallel.f90:138
|#10  0x8202A37 in invfft_x_ at fft_interfaces.f90:149
|#11  0x81C431A in vloc_psi_gamma_ at vloc_psi.f90:124
|#12  0x8188FF4 in h_psi_ at h_psi.f90:112
|#13  0x81B1000 in rotate_wfc_gamma_ at rotate_wfc_gamma.f90:60
|#14  0x81AE467 in rotate_wfc_ at rotate_wfc.f90:67
|#15  0x810619B in init_wfc_ at wfcinit.f90:281 (discriminator 2)
|#16  0x810680F in wfcinit_ at wfcinit.f90:131
|#17  0x8070F49 in init_run_ at init_run.f90:96
|#18  0x804EEA9 in pwscf at pwscf.f90:95
|Segmentation fault
|Checking atom-pbe...FAILED with error condition!

This is apparently because FFTW is compiled with SSE2 support on Debian,
so it expects 16-byte aligned arrays.  However, as far as I understand,
plans are reused for performance reasons, and if the arrays passed when a
plan is executed are aligned differently from the array the plan was
created with, they might no longer be aligned properly for the SSE2 code
and a segfault occurs.  Thus this only happens on 32-bit architectures
using SSE2 (we had the same issue with CP2K).
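
To make the failure condition concrete, here is a minimal sketch in plain
C of the rule as I understand it from the FFTW manual: a plan may be
executed on a different array only if that array has the same alignment as
the one used at planning time, unless the plan was created with
FFTW_UNALIGNED.  The helper names (is_aligned16, new_array_ok) are
hypothetical and not part of FFTW or QE, and the 16-byte boundary is the
SSE2 assumption from above.

```c
#include <stdint.h>

/* Return 1 if p sits on a 16-byte boundary, which is what aligned SSE2
 * loads and stores require. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical model of the reuse rule: executing a plan on a new array
 * is fine if the plan was created with FFTW_UNALIGNED, or if the new
 * array has the same alignment status as the array used for planning. */
static int new_array_ok(const void *planned_with,
                        const void *executed_with,
                        int plan_is_unaligned)
{
    if (plan_is_unaligned)
        return 1;
    return is_aligned16(planned_with) == is_aligned16(executed_with);
}
```

An array that starts 8 bytes past a 16-byte boundary fails this check
against a plan made for an aligned array, which would correspond to the
crash in the backtrace above.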

I made a patch for the Debian package which optionally adds
FFTW_UNALIGNED to the plan bitmask in Modules/fft_scalar.f90 if the code
is being compiled with -D_FFTW_FORCE_UNALIGNED.  To make things easy, I
just added support for an environment variable $(FFTW_FORCE_ALIGN) for
DFLAGS in install/make.sys.in, so one can set this to get QE compiled
with -D_FFTW_FORCE_UNALIGNED.  I am sure it could be named better or
possibly even integrated with configure, but I wanted to at least share
the patch we use in Debian, see attached.
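
For readers without the attachment, the idea can be sketched in C roughly
as follows.  This is not the patch itself, which touches the Fortran code
in Modules/fft_scalar.f90.  The SKETCH_* constants are stand-ins so the
fragment compiles without fftw3.h, plan_flags is a hypothetical helper,
and using the "estimate" planner flag as the base value is an assumption.

```c
/* Stand-in flag values so this sketch compiles on its own.  Real code
 * would include <fftw3.h> and use FFTW_ESTIMATE and FFTW_UNALIGNED. */
#define SKETCH_ESTIMATE  (1U << 6)
#define SKETCH_UNALIGNED (1U << 1)

/* Build the planner bitmask.  The "unaligned" bit is added only when the
 * code is compiled with -D_FFTW_FORCE_UNALIGNED. */
static unsigned plan_flags(void)
{
    unsigned flags = SKETCH_ESTIMATE;
#ifdef _FFTW_FORCE_UNALIGNED
    flags |= SKETCH_UNALIGNED;   /* planner must not assume alignment */
#endif
    return flags;
}
```

Without the define the bitmask is unchanged, so builds that do not set the
variable behave as before.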

Further information can be found here:


A different, more involved, workaround would be to use fftw_malloc() for
the input arrays.
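
As a rough illustration of what that workaround amounts to, the sketch
below hands out 16-byte aligned buffers using only the C standard library.
It is a stand-in for fftw_malloc()/fftw_free(), not their implementation.
The names aligned_buf, alloc_aligned16 and free_aligned16 are
hypothetical, and wiring such buffers into the Fortran arrays is the
"more involved" part that is not shown.

```c
#include <stdint.h>
#include <stdlib.h>

/* Keep the pointer returned by malloc (needed for free) next to the
 * 16-byte aligned pointer that is handed to the FFT. */
typedef struct {
    void   *raw;
    double *data;
} aligned_buf;

/* Allocate room for n_doubles doubles starting on a 16-byte boundary by
 * over-allocating 15 bytes and rounding the address up. */
static aligned_buf alloc_aligned16(size_t n_doubles)
{
    aligned_buf b = { NULL, NULL };
    b.raw = malloc(n_doubles * sizeof(double) + 15u);
    if (b.raw != NULL)
        b.data = (double *)(((uintptr_t)b.raw + 15u) & ~(uintptr_t)15u);
    return b;
}

/* Release the buffer through the original malloc pointer. */
static void free_aligned16(aligned_buf *b)
{
    free(b->raw);
    b->raw = NULL;
    b->data = NULL;
}
```

If every array passed to a plan came from an allocator like this, all of
them would share the same alignment and FFTW_UNALIGNED would not be
needed.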

-------------- next part --------------
A non-text attachment was scrubbed...
Name: fftw_i386_unaligned.patch
Type: text/x-diff
Size: 6783 bytes
Desc: not available
URL: <http://lists.quantum-espresso.org/pipermail/developers/attachments/20131210/bb631d50/attachment.bin>
