Trouble compiling with mpif90 on JASMIN

I have run into a new problem compiling JULES on JASMIN. I am compiling on the cylc1 node, but the problem seems to be present on the sci* nodes as well.

The suite is /home/users/pmcguire/roses/u-cg374_Shanghaiv9i3

The error message is:

mpif90 -oo/u_v_grid.o -c -DSCMA -DBL_DIAG_HACK -DINTEL_FORTRAN 
-I./include -I/home/users/siwilson/netcdf.openmpi/include 
-heap-arrays -fp-model precise -traceback
 /home/users/pmcguire/cylc-run/u-cg374_Shanghaiv9i3/share/fcm_make/preprocess/src/jules/src/control/standalone/var/u_v_grid.F90 # rc=1
[FAIL]
[FAIL] Please verify that both the operating system and 
       the processor support Intel(R) F16C and AVX instructions.

JASMIN was updated recently. The cylc1 node is an Intel node.
When I do a module load jaspy, then a bare mpif90 doesn’t give any errors, unlike when I run a bare mpif90 after loading the modules that the suite loads:

   eval $(rose task-env)
   export PATH=/apps/jasmin/metomi/bin:$PATH
   module load intel/19.0.0
   module load contrib/gnu/gcc/7.3.0
   module load eb/OpenMPI/intel/3.1.1
   module list 2>&1
   env | grep LD_LIBRARY_PATH
   export NETCDF_FORTRAN_ROOT=/home/users/siwilson/netcdf_par/3.1.1/intel.19.0.0/
   export NETCDF_ROOT=/home/users/siwilson/netcdf_par/3.1.1/intel.19.0.0/
   export HDF5_LIBDIR=/home/users/siwilson/netcdf_par/3.1.1/intel.19.0.0/lib
   export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
   export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HDF5_LIBDIR
   env | grep LD_LIBRARY_PATH
 """

But loading the jaspy module doesn’t solve the problem, since we need some of the other modules, and loading jaspy overrides some of them.
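A quick way to check which mpif90 is being picked up, and whether the node actually reports the AVX and F16C instruction sets the error complains about, might be something like the following (just a diagnostic sketch; it assumes the usual flags line in /proc/cpuinfo):

   # show which mpif90 wrapper is first on PATH
   which mpif90
   # check whether the CPU advertises the AVX and F16C instruction sets
   grep -o -w -e avx -e f16c /proc/cpuinfo | sort -u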

Any suggestions?
Patrick McGuire

This is a JASMIN issue:

[siwilson@sci2 ~]$ module load eb/OpenMPI/intel/3.1.1
[siwilson@sci2 ~]$ module load intel/19.0.0
[siwilson@sci2 ~]$ mpif90

Please verify that both the operating system and the processor support Intel(R) F16C and AVX instructions.

[siwilson@sci2 ~]$ which mpif90
/apps/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/bin/mpif90
[siwilson@sci2 ~]$ module load jaspy
[siwilson@sci2 ~]$ mpif90
ifort: command line error: no files specified; for help type "ifort -help"
[siwilson@sci2 ~]$ which mpif90
/apps/jasmin/jaspy/miniconda_envs/jaspy3.8/m3-4.9.2/envs/jaspy3.8-m3-4.9.2-r20211105/bin/mpif90

It appears that /apps/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/bin/mpif90 is now broken.

Patrick,
has this just started happening in the last day?
I’ve been recommending people standardise things with JULES on JASMIN, so I tell them to submit through LOTUS (which I know is slow) and recommend constraining to certain nodes (Skylake for testing, but Ivybridge should be OK too). I don’t know whether things are different on those nodes, but I have compiled and run JULES this week in this way.
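As a rough illustration (the partition and constraint names here are only examples; check what is currently offered on LOTUS), an interactive shell on one of those constrained nodes can be requested with something like:

   # ask SLURM for an interactive shell on a Skylake node (names are illustrative)
   srun --partition=par-multi --constraint=skylake348G --ntasks=1 --pty /bin/bash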

Have you informed JASMIN? They may have made a mistake???

Dave

Hi Dave:
I just noticed it yesterday. It was working fine a week or two ago. I do the compile in the background on the cylc1 node. Normally, I do the run on SLURM/LOTUS, but that’s not the problem here, since I don’t get that far.

I just emailed the JASMIN support desk about this. Thank you to Simon for his definitive diagnosis.
Patrick

Hi Dave:
You suggested during the meeting today that I compile on LOTUS/SLURM instead of the background. Do you have a suite set up to do this? I would like to see it.
Patrick

The trunk has this - look at https://code.metoffice.gov.uk/trac/jules/browser/main/trunk/rose-stem/include/jasmin/runtime.rc

The fcm make task inherits the lotus settings, which will have something like:

    [[[job]]]
        batch system = slurm
    [[[directives]]]
        --partition = par-multi
        --constraint = "ivybridge128G|skylake348G|broadwell256G"

Thanks, Dave:
So for a 10-minute compile job, I have to wait in the par-multi queue? This often takes half a day of waiting or more. It is a lot easier and quicker to compile in the background. Would short-serial be an option? There are Intel processors on there too, but there is often half a day or more of queueing time there as well.
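If short-serial is usable for this, I imagine a per-task override would look something along these lines (the partition name and time limit are my guesses, not tested):

    [[fcm_make]]
        inherit = LOTUS
        [[[directives]]]
            # the inherited --constraint may also need changing for this partition
            --partition = short-serial
            --time = 00:30:00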

I will try it, though.
Patrick

Hi Dave:
Yes, that worked. I compiled JULES with mpif90 on SLURM/LOTUS as a batch job, and it worked fine. But I did need to wait in the queue for several hours, if I recall correctly, for a 10-minute compile job.

And mpif90 (the wrapper built with the Intel ifort compiler) still doesn’t work on the cylc1 VM in interactive mode, last I checked. I am hoping the folks at JASMIN will be able to fix that sometime soon, as they said they would try to do. Then I wouldn’t need to run mpif90 on SLURM/LOTUS as a batch job.
Patrick

mpif90 appears to be working on sci2, but is still broken on sci1. I haven’t tested the other nodes.

Thanks for reporting this.
I’ve tried looking at it on the cylc1 node, and there are no errors. I’ve compiled a couple of executables, so it seems to be working again.

Cheers all

I meant to say it isn’t working on sci3 rather than sci1.

excellent! Thank you, Simon and Dave.

I just tried loading the modules on cylc1 and then running mpif90, and I no longer get the error I was getting before. I guess the JASMIN team did indeed succeed in fixing this.

Patrick
