[Wannier] No seedname_u.mat output + segmentation fault

Payden Brown plbrown5 at asu.edu
Thu Apr 27 01:14:42 CEST 2023


Sorry! The error message was lost. Here it is:

*TACC:  Starting up job 5423502 *
*TACC:  Starting parallel tasks... *
*[c121-002:267668:0:267668] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*[c121-002:267670:0:267670] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*[c121-002:267677:0:267677] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*[c121-002:267679:0:267679] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*[c121-002:267681:0:267681] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*[c121-002:267673:0:267673] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*[c121-002:267672:0:267672] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*[c121-002:267669:0:267669] Caught signal 11 (Segmentation fault: address
not mapped to object at address 0x4)*
*==== backtrace (tid: 267668) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*==== backtrace (tid: 267670) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*==== backtrace (tid: 267677) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*==== backtrace (tid: 267681) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*==== backtrace (tid: 267669) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*==== backtrace (tid: 267673) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*==== backtrace (tid: 267672) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*==== backtrace (tid: 267679) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
* 2 0x000000000040509e MAIN__()  ???:0*
* 3 0x0000000000404a12 main()  ???:0*
* 4 0x0000000000022555 __libc_start_main()  ???:0*
* 5 0x0000000000404929 _start()  ???:0*
*=================================*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002AB2D3AFF630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002AB2D5B5A555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002B45C5251630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002B45C72AC555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002AD8C43D6630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002AD8C6431555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002B5E3EA5E630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002B5E40AB9555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002B9771B42630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002B9773B9D555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002B67D277E630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002B67D47D9555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002AF7E5406630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002AF7E7461555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*
*forrtl: severe (174): SIGSEGV, segmentation fault occurred*
*Image              PC                Routine            Line        Source
            *
*wannier90.x        000000000055C5BA  Unknown               Unknown
 Unknown*
*libpthread-2.17.s  00002B81C2B60630  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004C7C73  Unknown               Unknown
 Unknown*
*wannier90.x        00000000004AF265  Unknown               Unknown
 Unknown*
*wannier90.x        000000000040509E  Unknown               Unknown
 Unknown*
*wannier90.x        0000000000404A12  Unknown               Unknown
 Unknown*
*libc-2.17.so       00002B81C4BBB555
 __libc_start_main     Unknown  Unknown*
*wannier90.x        0000000000404929  Unknown               Unknown
 Unknown*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 0 PID 267667 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 4 PID 267671 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 7 PID 267674 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 8 PID 267675 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 9 PID 267676 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 11 PID 267678 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 13 PID 267680 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*

*===================================================================================*
*=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES*
*=   RANK 15 PID 267682 RUNNING AT c121-002*
*=   KILLED BY SIGNAL: 9 (Killed)*
*===================================================================================*
*TACC:  MPI job exited with code: 255 *
*TACC:  Shutdown complete. Exiting.*
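
Every backtrace in the log above has the same innermost frame, w90_wannierise_mp_wann_write_xyz_. For a longer log, a quick tally of frame 0 across all backtraces shows whether every rank died in the same routine. The following is only a minimal sketch in Python: the function name top_frames and the short sample string are illustrative, and the regular expression assumes frames are formatted as in the log above.

```python
import re
from collections import Counter

# Matches backtrace frames as they appear in the log above, e.g.
# "* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*"
# Group 1 is the frame index, group 2 is the routine name.
FRAME_RE = re.compile(r"^\*?\s*(\d+)\s+0x[0-9a-fA-F]+\s+(\S+?)\(\)")


def top_frames(log_text):
    """Count the innermost (frame 0) routine of every backtrace in the log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group(1) == "0":
            counts[match.group(2)] += 1
    return counts


# Short excerpt copied from the log above (two of the eight backtraces).
sample = """\
*==== backtrace (tid: 267668) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
*==== backtrace (tid: 267670) ====*
* 0 0x00000000004c7c73 w90_wannierise_mp_wann_write_xyz_()  ???:0*
* 1 0x00000000004af265 w90_wannierise_mp_wann_main_()  ???:0*
"""

for routine, n in top_frames(sample).items():
    print(f"{n} backtrace(s) end in {routine}")
```

To run it on a full job output, read the file into a string and pass it to top_frames in place of the sample.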

On Tue, Apr 25, 2023 at 6:18 PM Payden Brown <plbrown5 at asu.edu> wrote:

> Hello all,
>
> I am trying to run wannier90.x in parallel on PbWO4. At first, the run
> hit a segmentation fault at around iteration 300 of 5000. Thinking it
> could be a stack or memory issue, I changed the soft ulimit to unlimited
> and I am now using the "optimisation=0" setting for memory optimization.
> According to pbwo4.wout, the "wannierise" calculation now runs all 5000
> iterations and reaches the final WF state. It then either stops while
> pbwo4_centres.xyz is being written, or finishes writing that file and
> stops without outputting pbwo4_u.mat. The job output still reports
> segmentation faults. I am new to wannier90, so I don't know whether there
> is an obvious or common fix, or whether the problem lies in my input or in
> how I compiled the code. A description of the system and the input file
> follows:
>
> System: TACC Frontera. wannier90 was compiled with intel19 and the
> make.inc.ifort file. I'm using the MKL library and modified the MKL
> libpath in make.inc.ifort to point to the right location. There were no
> issues when compiling, and "wannier90.x -pp" also runs without any issues.
>
> contents of pbwo4.win (1000 kpoints are truncated):
> begin projections
> random
> end projections
> guiding_centres=true
>
> num_bands = 16
> num_wann = 16
>
> iprint = 2
> dis_num_iter =  500
> dis_win_max =   9.0
> dis_froz_max =  9.0
> dis_conv_tol     = 1.0d-9
> dis_conv_window  = 20
>
> num_iter  =   5000
> conv_tol        = 1.0d-9
> conv_window     = 20
> mp_grid = 10 10 10
>
> begin unit_cell_cart
> Ang
> 0.0000000000000000    5.6128862829189758    0.0000000000000000
> 4.9874188999999998    0.0000000000000000    0.0000000000000000
> 0.0000000000000000   -4.2185204449493545  -12.9777995413697234
> end unit_cell_cart
>
> write_u_matrices = .true.
> write_xyz = .true.
>
> #restart = plot
> BANDS_PLOT = TRUE
> BANDS_PLOT_FORMAT = gnuplot
> BANDS_NUM_POINTS = 100
>
> BEGIN KPOINT_PATH
> G 0.000 0.000 0.000  Z 0.000 0.500 0.000
> Z 0.000 0.500 0.000  D 0.000 0.500 0.500
> D 0.000 0.500 0.500  B 0.000 0.000 0.500
> B 0.000 0.000 0.500  G 0.000 0.000 0.000
> G 0.000 0.000 0.000  A -0.500 0.000 0.500
> A -0.500 0.000 0.500  E -0.500 0.500 0.500
> E -0.500 0.500 0.500  Z 0.000 0.500 0.000
> Z 0.000 0.500 0.000  C -0.500 0.500 0.000
> C -0.500 0.500 0.000  Y -0.500 0.000 0.000
> Y -0.500 0.000 0.000  G 0.000 0.000 0.000
> END KPOINT_PATH
>
> begin atoms_frac
> Pb   0.1679799000000000    0.3033784899999999    0.6510552800000000
> Pb   0.8320201000000000    0.8033784900000001    0.8489447200000000
> Pb   0.8320201000000000    0.6966215100000001    0.3489447200000000
> Pb   0.1679799000000000    0.1966215100000001    0.1510552800000000
> W    0.6119139800000002    0.7507408600000001    0.5773005600000001
> W    0.3880860199999998    0.2507408600000001    0.9226994399999999
> W    0.3880860199999998    0.2492591399999999    0.4226994399999999
> W    0.6119139800000002    0.7492591399999999    0.0773005600000001
> O    0.7282079600000000    0.4559118599999999    0.5179289499999999
> O    0.2717920400000000    0.9559118600000001    0.9820710500000001
> O    0.2717920400000000    0.5440881400000001    0.4820710500000001
> O    0.7282079600000000    0.0440881400000001    0.0179289499999999
> O    0.3917558500000000    0.0565561700000001    0.5614836900000000
> O    0.6082441500000000    0.5565561700000001    0.9385163100000000
> O    0.6082441500000000    0.9434438299999999    0.4385163100000000
> O    0.3917558500000000    0.4434438299999999    0.0614836900000000
> O    0.9097576700000001    0.8854765500000000    0.6490210500000000
> O    0.0902423299999999    0.3854765500000000    0.8509789500000000
> O    0.0902423299999999    0.1145234500000000    0.3509789500000000
> O    0.9097576700000001    0.6145234500000000    0.1490210500000000
> O    0.5403943700000000    0.6068396200000001    0.6868019599999999
> O    0.4596056300000000    0.1068396200000001    0.8131980400000001
> O    0.4596056300000000    0.3931603799999999    0.3131980400000001
> O    0.5403943700000000    0.8931603799999999    0.1868019599999999
> end atoms_frac
>
> begin kpoints
>   0.00000000  0.00000000  0.00000000
>   0.00000000  0.00000000  0.10000000
>   0.00000000  0.00000000  0.20000000
>   0.00000000  0.00000000  0.30000000
>                            .
>                            .
>                            .
> Error messages:
>
> [image: Screen Shot 2023-04-25 at 1.42.05 PM.png]
> [image: Screen Shot 2023-04-25 at 1.42.26 PM.png]
>
> Your help and suggestions are very much appreciated,
> Payden
>
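
The input above sets mp_grid = 10 10 10 and is described as listing 1000 k-points, which is consistent since 10 x 10 x 10 = 1000. When ruling out the input as a cause, that consistency can be checked mechanically. The following is a minimal sketch in Python, not part of wannier90: the helper names read_mp_grid and count_kpoints are hypothetical, the parsing assumes the plain "keyword = value" and "begin/end kpoints" layout shown above, and the embedded sample is a made-up two-point grid used only to exercise the code.

```python
import re
from math import prod


def read_mp_grid(win_text):
    """Return the three mp_grid integers from the text of a .win file."""
    match = re.search(
        r"^\s*mp_grid\s*[=:]?\s*(\d+)\s+(\d+)\s+(\d+)",
        win_text,
        re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        raise ValueError("mp_grid not found")
    return tuple(int(g) for g in match.groups())


def count_kpoints(win_text):
    """Count non-empty lines between 'begin kpoints' and 'end kpoints'."""
    inside = False
    n = 0
    for raw in win_text.splitlines():
        words = raw.strip().lower().split()
        if words == ["begin", "kpoints"]:
            inside = True
        elif words == ["end", "kpoints"]:
            return n
        elif inside and words:
            n += 1
    raise ValueError("complete kpoints block not found")


# Toy input (hypothetical 2x1x1 grid), not the PbWO4 file from this thread.
sample_win = """\
mp_grid = 2 1 1

begin kpoints
  0.00000000  0.00000000  0.00000000
  0.50000000  0.00000000  0.00000000
end kpoints
"""

grid = read_mp_grid(sample_win)
n_k = count_kpoints(sample_win)
print(f"mp_grid {grid} expects {prod(grid)} k-points, file lists {n_k}")
```

For the real file, read pbwo4.win into a string and pass it to both functions; a mismatch between the two numbers would point at the input rather than the build.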