I have a new problem with compiling JULES on JASMIN. I am compiling on the cylc1 node, but the problem seems to be present on the sci* nodes as well.
The suite is /home/users/pmcguire/roses/u-cg374_Shanghaiv9i3
The error message is:
mpif90 -oo/u_v_grid.o -c -DSCMA -DBL_DIAG_HACK -DINTEL_FORTRAN
-I./include -I/home/users/siwilson/netcdf.openmpi/include
-heap-arrays -fp-model precise -traceback
/home/users/pmcguire/cylc-run/u-cg374_Shanghaiv9i3/share/fcm_make/preprocess/src/jules/src/control/standalone/var/u_v_grid.F90 # rc=1
[FAIL]
[FAIL] Please verify that both the operating system and
the processor support Intel(R) F16C and AVX instructions.
JASMIN was updated recently. The cylc1 node is an Intel node.
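A quick sanity check, assuming a standard Linux /proc/cpuinfo, is to ask whether the node's CPU actually advertises those instruction sets:

# Prints "avx" and "f16c" if the first CPU's flags line advertises them
grep -m1 -ow -e avx -e f16c /proc/cpuinfo

If both appear, the hardware is fine and the complaint is coming from the toolchain itself.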
When I do a module load jaspy, a bare mpif90 doesn’t give any errors, unlike a bare mpif90 run after loading the modules that the suite loads. Loading the jaspy module doesn’t solve the problem, though, since we need some of the other modules, and loading jaspy overrides the loading of some of those. Simon’s comparison on sci2 shows the difference:
[siwilson@sci2 ~]$ module load eb/OpenMPI/intel/3.1.1
[siwilson@sci2 ~]$ module load intel/19.0.0
[siwilson@sci2 ~]$ mpif90
Please verify that both the operating system and the processor support Intel(R) F16C and AVX instructions.
[siwilson@sci2 ~]$ which mpif90
/apps/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/bin/mpif90
[siwilson@sci2 ~]$ module load jaspy
[siwilson@sci2 ~]$ mpif90
ifort: command line error: no files specified; for help type "ifort -help"
[siwilson@sci2 ~]$ which mpif90
/apps/jasmin/jaspy/miniconda_envs/jaspy3.8/m3-4.9.2/envs/jaspy3.8-m3-4.9.2-r20211105/bin/mpif90
It appears that /apps/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/bin/mpif90 is now broken.
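Open MPI’s compiler wrappers accept a --showme option that prints the underlying compiler command without compiling anything, which is a quick way to see what the broken wrapper is trying to invoke:

# Show the ifort invocation that this wrapper would run
/apps/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/bin/mpif90 --showme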
Patrick,
has this just started happening in the last day?
I’ve been recommending that people standardise how they run JULES on JASMIN, so I tell them to submit through LOTUS (which I know is slow) and to constrain jobs to certain node types (Skylake for testing, but Ivybridge should be OK too). I don’t know whether things behave differently on those nodes, but I have compiled and run JULES this week in this way.
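For anyone following that route, a minimal LOTUS submission pinned to a node type might look roughly like this (a sketch only: the constraint name and build command below are illustrative, and the real feature names should be checked with sinfo):

#!/bin/bash
#SBATCH --partition=par-multi     # the queue discussed below
#SBATCH --constraint=skylake      # illustrative feature name; list real ones with: sinfo -o "%f"
#SBATCH --time=00:30:00
module load eb/OpenMPI/intel/3.1.1
module load intel/19.0.0
fcm make -f fcm-make.cfg          # stand-in for whatever build command the suite runs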
Have you informed JASMIN? They may have made a mistake???
Hi Dave:
I just noticed it yesterday. It was working fine a week or two ago. I do the compile in the background on the cylc1 node. Normally, I do the run on SLURM/LOTUS, but that’s not the problem here, since I don’t get that far.
I just emailed the JASMIN support desk about this. Thank you to Simon for his definitive diagnosis.
Patrick
Hi Dave:
You suggested during the meeting today that I compile on LOTUS/SLURM instead of in the background. Do you have a suite set up to do this? I would like to see it.
Patrick
Thanks, Dave:
So for a 10-minute compile job, I have to wait in the par-multi queue? That often means half a day of waiting or more. It is a lot easier and quicker to compile in the background. Would short-serial be an option? There are Intel processors there too, but there is often half a day or more of queueing time there as well.
Hi Dave:
Yes, that worked. I compiled JULES with mpif90 on SLURM/LOTUS as a batch job, and it worked fine. But I did need to wait in the queue for several hours, if I recall correctly, for a 10-minute compile job.
And the mpif90 wrapper (built against the Intel ifort compiler) still doesn’t work on the cylc1 VM in interactive mode, last I checked. I am hoping the folks at JASMIN will be able to fix that soon, as they said they would try to do. Then I wouldn’t need to run mpif90 on SLURM/LOTUS as a batch job.
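For reference, the change amounts to pointing the build task at SLURM in the suite definition. A rough sketch, assuming a Cylc 7 suite.rc (the task name and directives here are illustrative, not copied from u-cg374):

[runtime]
    [[fcm_make]]
        [[[job]]]
            batch system = slurm
            execution time limit = PT30M
        [[[directives]]]
            --partition = par-multi   # illustrative; short-serial may also suit a serial compile

The fcm_make task then queues like any other LOTUS job instead of running in the background on cylc1.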
Patrick
Thanks for reporting this.
I’ve tried looking at it on the cylc1 node, and there are no errors. I’ve compiled a couple of executables, so it seems to be working again.
I just tried loading the modules on cylc1 and then running mpif90, and I no longer get the error that I was getting before. I guess the JASMIN team did indeed succeed in fixing this.