<html><head></head><body><div class="ydp1db16496yahoo-style-wrap" style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:16px;"><div dir="ltr" data-setdir="false"><div><div dir="ltr">Hi everyone, <br></div><div dir="ltr">a) I want to compile Quantum ESPRESSO and run it on the GPUs of our HPC cluster.</div><div dir="ltr">I have tried several times without success.</div><div dir="ltr">Could you please suggest a workflow?</div><div dir="ltr"><br></div><div dir="ltr">b) Secondly, on the CPU nodes of the HPC cluster, I use the Slurm script below:</div><div dir="ltr"><div dir="ltr" data-setdir="false">#!/bin/bash<br>#SBATCH -J scf_cpu             # Job name<br>#SBATCH --account=abc@gmail.com<br>#SBATCH -p cpu                 # CPU partition<br>#SBATCH -N 1                   # Number of nodes<br>#SBATCH -n 40                   # Total MPI tasks<br>#SBATCH -t 12:00:00           # Walltime hh:mm:ss<br>#SBATCH -o scf_cpu.%j.out      # Standard output<br>#SBATCH -e scf_cpu.%j.err      # Standard error<br>#SBATCH --mail-user=ish.3@gmail.com<br>#SBATCH --mail-type=BEGIN,END,FAIL<br><br># Load Quantum ESPRESSO module<br>module purge<br>module use /opt/apps/modulefiles<br>module load qe/7.3.1-mpi<br>module load libxc/6.1.0<br>cd $SLURM_SUBMIT_DIR<br><br>##INPUT=scf.in<br>##OUTPUT=scf.out    # no spaces!<br><br># Run the calculation<br>mpirun -np $SLURM_NTASKS pw.x -inp scf.in > scf.out</div><div><br></div><div><br></div><div dir="ltr">The calculation runs very slowly. 
<br></div><div dir="ltr">Please help me out.</div><div dir="ltr">Output of lscpu on the HPC node:</div><div dir="ltr"><br></div><div dir="ltr"><div>Architecture:                x86_64<br>  CPU op-mode(s):            32-bit, 64-bit<br>  Address sizes:             46 bits physical, 48 bits virtual<br>  Byte Order:                Little Endian<br>CPU(s):                      80<br>  On-line CPU(s) list:       0-79<br>Vendor ID:                   GenuineIntel<br>  Model name:                Intel(R) Xeon(R) Gold 6242R CPU @ 3.10GHz<br>    CPU family:              6<br>    Model:                   85<br>    Thread(s) per core:      2<br>    Core(s) per socket:      20<br>    Socket(s):               2<br>    Stepping:                7<br>    BogoMIPS:                6200.00<br>    Flags:                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pg<br>                             e mca cmov pat pse36 clflush dts acpi mmx fxsr sse <br>                             sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm cons<br>                             tant_tsc art arch_perfmon pebs bts rep_good nopl xt<br>                             opology nonstop_tsc cpuid aperfmperf pni pclmulqdq <br>                             dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fm<br>                             a cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movb<br>                             e popcnt tsc_deadline_timer aes xsave avx f16c rdra<br>                             nd lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3<br>                              cdp_l3 intel_ppin ssbd mba ibrs ibpb stibp ibrs_en<br>                             hanced tpr_shadow flexpriority ept vpid ept_ad fsgs<br>                             base tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cq<br>                             m mpx rdt_a avx512f avx512dq rdseed adx smap clflus<br>                             hopt clwb intel_pt avx512cd avx512bw avx512vl xsave<br>                             opt xsavec xgetbv1 xsaves 
cqm_llc cqm_occup_llc cqm<br>                             _mbm_total cqm_mbm_local dtherm ida arat pln pts vn<br>                             mi pku ospke avx512_vnni md_clear flush_l1d arch_ca<br>                             pabilities indirect_thunk_its<br>Virtualization features:     <br>  Virtualization:            VT-x<br>Caches (sum of all):         <br>  L1d:                       1.3 MiB (40 instances)<br>  L1i:                       1.3 MiB (40 instances)<br>  L2:                        40 MiB (40 instances)<br>  L3:                        71.5 MiB (2 instances)<br>NUMA:                        <br>  NUMA node(s):              2<br>  NUMA node0 CPU(s):         0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36<br>                             ,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70<br>                             ,72,74,76,78<br>  NUMA node1 CPU(s):         1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37<br>                             ,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71<br>                             ,73,75,77,79<br>Vulnerabilities:             <br>  Gather data sampling:      Mitigation; Microcode<br>  Indirect target selection: Mitigation; Aligned branch/return thunks<br>  Itlb multihit:             KVM: Mitigation: Split huge pages<br>  L1tf:                      Not affected<br>  Mds:                       Not affected<br>  Meltdown:                  Not affected<br>  Mmio stale data:           Mitigation; Clear CPU buffers; SMT vulnerable<br>  Reg file data sampling:    Not affected<br>  Retbleed:                  Mitigation; Enhanced IBRS<br>  Spec rstack overflow:      Not affected<br>  Spec store bypass:         Mitigation; Speculative Store Bypass disabled via p<br>                             rctl<br>  Spectre v1:                Mitigation; usercopy/swapgs barriers and __user poi<br>                             nter sanitization<br>  Spectre v2:                Mitigation; Enhanced / Automatic IBRS; IBPB 
conditi<br>                             onal; RSB filling; PBRSB-eIBRS SW sequence; BHI SW <br>                             loop, KVM SW loop<br>  Srbds:                     Not affected<br>  Tsx async abort:           Mitigation; TSX disabled<br></div></div></div></div><br></div></div></body></html>