<html>
<head>
<style>
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
FONT-SIZE: 10pt;
FONT-FAMILY:Tahoma
}
</style>
</head>
<body class='hmmessage'>
Dear all,<br><br>I built a cluster of 5 computers, each with an Intel Core(TM)2 Quad Q6600 CPU (quad-core) and 8 GB of memory (40 GB total), on S3000AH system boards. The network is 1 Gbit Ethernet. I also checked that the EM64T option in the BIOS is on, so I believe the Q6600 is a CPU that uses EM64T technology. For more information about my CPU, see http://processorfinder.intel.com/details.aspx?sSpec=SL9UM <br>I also ran "more /proc/cpuinfo"; the information for my OS and CPU is as follows:<br>LSB Version: :core-3.0-amd64:core-3.0-ia32:core-3.0-noarch:graphics-3.0-amd64:graphics-3.0-ia32:graphics-3.0-noarch<br>Distributor ID: RedHatEnterpriseAS<br>Description: Red Hat Enterprise Linux AS release 4 (Nahant Update 4)<br>Release: 4<br>Codename: NahantUpdate4<br>processor : 0<br>vendor_id : GenuineIntel<br>cpu family : 6<br>model : 15<br>model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz<br>stepping : 11<br>cpu MHz : 2400.150<br>cache size : 4096 KB<br>physical id : 0<br>siblings : 4<br>core id : 0<br>cpu cores : 4<br>fpu : yes<br>fpu_exception : yes<br>cpuid level : 10<br>wp : yes<br>flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cx16 xtpr<br>bogomips : 4806.03<br>clflush size : 64<br>cache_alignment : 64<br>address sizes : 36 bits physical, 48 bits virtual<br>power management:<br><br>processor : 1<br>vendor_id : GenuineIntel<br>cpu family : 6<br>model : 15<br>model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz<br>stepping : 11<br>cpu MHz : 2400.150<br>cache size : 4096 KB<br>physical id : 0<br>siblings : 4<br>core id : 2<br>cpu cores : 4<br>fpu : yes<br>fpu_exception : yes<br>cpuid level : 10<br>wp : yes<br>flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx 
lm pni monitor ds_cpl est tm2 cx16 xtpr<br>bogomips : 4798.75<br>clflush size : 64<br>cache_alignment : 64<br>address sizes : 36 bits physical, 48 bits virtual<br>power management:<br><br>processor : 2<br>vendor_id : GenuineIntel<br>cpu family : 6<br>model : 15<br>model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz<br>stepping : 11<br>cpu MHz : 2400.150<br>cache size : 4096 KB<br>physical id : 0<br>siblings : 4<br>core id : 1<br>cpu cores : 4<br>fpu : yes<br>fpu_exception : yes<br>cpuid level : 10<br>wp : yes<br>flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cx16 xtpr<br>bogomips : 4799.49<br>clflush size : 64<br>cache_alignment : 64<br>address sizes : 36 bits physical, 48 bits virtual<br>power management:<br><br>processor : 3<br>vendor_id : GenuineIntel<br>cpu family : 6<br>model : 15<br>model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz<br>stepping : 11<br>cpu MHz : 2400.150<br>cache size : 4096 KB<br>physical id : 0<br>siblings : 4<br>core id : 3<br>cpu cores : 4<br>fpu : yes<br>fpu_exception : yes<br>cpuid level : 10<br>wp : yes<br>flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cx16 xtpr<br>bogomips : 4799.52<br>clflush size : 64<br>cache_alignment : 64<br>address sizes : 36 bits physical, 48 bits virtual<br>power management:<br><br>Therefore, I updated my Intel C++ and Fortran compilers from 10.1.008 to the latest version, 10.1.017 for Intel(R) 64, and MKL from 10.0.011 to the latest version, 10.0.3.020. The file names displayed on the website were l_cc_p_10.1.017_intel64.tar.gz, l_cc_p_10.1.017_intel64.tar.gz and l_mkl_p_10.0.3.020.tgz. After installing the three, I compiled the em64t versions of blas95 and lapack95 in /opt/intel/mkl/10.0.3.020/interfaces/ using the ifort under /opt/intel/fce/10.1.017/bin/. Then I compiled mpich2 using ifort and icc. 
However, when I compiled fftw 2.1.5 an error occurred, so I compiled fftw 2.1.5 using the 10.1.008 ifort and icc on another node with the same hardware, then copied it to the master node with scp. After all of the above was done, I turned to compiling QE.<br><br>To my surprise, QE detected my architecture as amd64, not ia32 or ia64. My first question is: does QE support Intel EM64T technology and take advantage of it?<br><br>In the end, I compiled QE for the amd64 architecture with the Intel C++ and Fortran 10.1.017 compilers and the MKL 10.0.3.020 library, but I find it less efficient than the QE compiled with the Intel C++ and Fortran 10.1.008 compilers and the MKL 10.0.011 library. The efficiency of QE compiled with the 10.1.008 compilers and MKL 10.0.011 is about 60%, but that of QE compiled with the 10.1.017 compilers is 10%, tested with an input file like this:<br> &CONTROL<br> title = 'Anatase lattice BFGS' ,<br> calculation = 'vc-relax' ,<br> restart_mode = 'from_scratch' ,<br> outdir = '/home/vega/tmp/' ,<br> pseudo_dir = '/home/vega/espresso-4.0/pseudo/' ,<br> prefix = 'Anatase lattice default' ,<br> etot_conv_thr = 0.000000735 ,<br> forc_conv_thr = 0.0011668141375 ,<br> nstep = 1000 ,<br> /<br> &SYSTEM<br> ibrav = 6,<br> celldm(1) = 7.135605333,<br> celldm(3) = 2.5121822033898305084745762711864,<br> nat = 12,<br> ntyp = 2,<br> ecutwfc = 25 ,<br> ecutrho = 200 ,<br> /<br> &ELECTRONS<br> conv_thr = 7.3D-8 ,<br> /<br> &IONS<br> ion_dynamics = 'bfgs' ,<br> /<br> &CELL<br> cell_dynamics = 'bfgs' ,<br> cell_dofree = 'xyz' ,<br> /<br>ATOMIC_SPECIES<br> Ti 47.86700 Ti.pw91-sp-van_ak.UPF <br> O 15.99940 O.pw91-van_ak.UPF <br>ATOMIC_POSITIONS angstrom <br> Ti 0.000000000 0.000000000 0.000000000 <br> Ti 1.888000000 1.888000000 4.743000000 <br> Ti 0.000000000 1.888000000 2.372000000 <br> Ti 1.888000000 0.000000000 7.115000000 <br> O 0.000000000 0.000000000 1.973000000 <br> O 1.888000000 1.888000000 6.716000000 <br> O 0.000000000 1.888000000 4.345000000 <br> O 1.888000000 0.000000000 9.088000000 <br> O 1.888000000 0.000000000 5.141000000 <br> O 
0.000000000 1.888000000 0.398000000 <br> O 1.888000000 1.888000000 2.770000000 <br> O 0.000000000 0.000000000 7.513000000 <br>K_POINTS automatic <br> 7 7 3 1 1 1 <br><br>My second question is about the efficiency: <br>Which compiler and MKL versions are best for my cluster?<br>Why did updating my MKL and compilers make QE less efficient?<br>What is the best efficiency my cluster can reach? Is 60% low or high for QE?<br><br>My third question is about BFGS cell optimization. I ran the above input file with cell_dofree = 'xyz' in the &CELL section. I think this means that only a, b and c of the lattice can change, while the three angles, alpha, beta and gamma, are fixed at 90 degrees according to PWgui, so the lattice should remain orthogonal. But the results showed that the angles were still changing. The output file is as follows:<br><br>......<br> entering subroutine stress ...<br><br> total stress (Ry/bohr**3) (kbar) P= -1.85<br> 0.00004989 0.00000000 0.00000000 7.34 0.00 0.00<br> 0.00000000 0.00005560 0.00000000 0.00 8.18 0.00<br> 0.00000000 0.00000000 -0.00014314 0.00 0.00 -21.06<br><br><br> number of scf cycles = 5<br> number of bfgs steps = 2<br><br> enthalpy old = -725.4093146855 Ry<br> enthalpy new = -725.4093492895 Ry<br><br> CASE: enthalpy_new < enthalpy_old<br><br> new trust radius = 0.0190701467 bohr<br> new conv_thr = 0.0000000074 Ry<br><br><br>CELL_PARAMETERS (alat)<br> 0.992368528 0.000000000 -0.000000009<br> 0.000000000 0.992410788 0.000000037<br> -0.000000021 0.000000091 2.503790203<br><br>ATOMIC_POSITIONS (angstrom)<br>Ti 0.000000000 0.000000000 0.000040364<br>Ti 1.873591740 1.873671740 4.727209100<br>Ti -0.000000020 1.873671654 2.362997426<br>Ti 1.873591720 0.000000258 7.091885462<br>O -0.000000017 0.000000072 1.972736887<br>O 1.873591723 1.873671812 6.700607058<br>O -0.000000037 1.873671726 4.336885456<br>O 1.873591703 0.000000330 9.064014758<br>O 1.873591737 0.000000186 5.117531758<br>O -0.000000004 1.873671582 0.390380409<br>O 1.873591756 1.873671668 
2.753778039<br>O -0.000000064 0.000000273 7.481645190<br><br><br><br> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br> from checkallsym : error # 2<br> not orthogonal operation<br> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%<br><br> stopping ...<br>application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0[cli_0]: aborting job:<br>application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0<br>rank 0 in job 3 node5_32785 caused collective abort of all ranks<br> exit status of rank 0: killed by signal 9 <br><br>Do you think I should never use BFGS to optimize the anatase lattice? CASTEP can do so; why can't QE?<br><br>Thank you for reading. I look forward to your responses.<br><br><br>Vega Lew<br>Ph.D. Candidate in Chemical Engineering<br>College of Chemistry and Chemical Engineering<br>Nanjing University of Technology, 210009, Nanjing, Jiangsu, China<br /></body>
</html>