JULES on JASMIN: non-INTEL or reduced-INTEL eventuality

Hello DaveC & SimonW:
The JASMIN team have advised me that some of the INTEL nodes have been decommissioned. See:

The JASMIN team have also said that the currently “available Intel nodes will be reaching their end of life. There is no guarantee that new Intel nodes will be procured in the near future.”

There are INTEL libraries used by the standard JULES FLUXNET suite u-al752 and by other JULES gridded suites (e.g. u-as052). See:

  [[JASMIN]]
        env-script = """
                eval $(rose task-env)
                export PATH=/apps/jasmin/metomi/bin:$PATH
                module load intel/19.0.0
                module load contrib/gnu/gcc/7.3.0
                module load eb/OpenMPI/intel/3.1.1
                module list 2>&1
                env | grep LD_LIBRARY_PATH
                export NETCDF_FORTRAN_ROOT=/gws/nopw/j04/jules/admin/netcdf/netcdf_par/3.1.1/intel.19.0.0/
                export NETCDF_ROOT=/gws/nopw/j04/jules/admin/netcdf/netcdf_par/3.1.1/intel.19.0.0/
                export HDF5_LIBDIR=/gws/nopw/j04/jules/admin/netcdf/netcdf_par/3.1.1/intel.19.0.0/lib
                export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
                export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HDF5_LIBDIR
                env | grep LD_LIBRARY_PATH
                """

Furthermore, the standard make commands used by these suites and by the JULES trunk on JASMIN may include some INTEL-specific compiler optimization flags. Historically, this has caused issues when JULES is compiled with one processor type on JASMIN and run on another.
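
For example, one quick way to look for such flags might be something like the following (the path here is just a guess at where a suite checkout lives; I haven’t verified it):

# Look for Intel-specific optimization flags (e.g. -xHost, -axAVX) in a suite
# checkout; the path is illustrative.
grep -rni -e '-xhost' -e '-axavx' ~/roses/u-al752/ 2>/dev/null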

The JASMIN team have advised us to prepare for the possible non-INTEL or reduced-INTEL eventuality. Can you help/advise?
Thanks,
Patrick McGuire

Patrick,

I’m afraid I don’t know what the advice JASMIN gave you was in relation to. LOTUS has various Intel node models. The JULES trunk on JASMIN specifies Skylake for the rose-stem testing, and if you want advice, I would go for that generally: you can compare to the KGO in the group workspace, and it’s not the oldest Intel hardware compared to some of the other things on JASMIN. There are 151 such nodes too (LOTUS cluster specification - JASMIN help docs).
I don’t know what JASMIN are intending to do, but it would be a surprise if this architecture were removed. They have Broadwell nodes, which are a bit older, so presumably those would be removed beforehand?

For compilers and libraries - sometimes these things do get moved around, which can cause issues.

I would probably clarify the message that JASMIN gave you. If all Intel nodes are being removed, that is half of LOTUS; if that is the case, when is it planned for? If it’s only that some of the Intel nodes are being removed in the intermediate future, then perhaps stick with Skylake.

Does this help?
Dave

Hi Dave
I directly quoted the JASMIN team above, about the Intel nodes. I will send you the email.
Patrick

I also quote myself from an NCAS CMS Slack message here: “A lot of this comes from the overloading of the short-serial queue. JULES often runs in less than 4 hours, so having the option to submit JULES to the short-serial-4hr queue could help. But as the email from the JASMIN team explains, there is no plan to add INTEL nodes to the short-serial-4hr partition. And JULES currently needs INTEL on JASMIN. This is one reason for the urging from the JASMIN team and for the NCAS CMS HELPDESK ticket that I submitted about having an AMD option for JULES on JASMIN.”

But this request is more general than short-serial-4hr access for JULES.
Patrick

I quote myself again from a 2nd NCAS CMS Slack message: “Often, when people submit a JULES job to SLURM, after compiling it in the background on the (INTEL) cylc1 virtual machine node, it needs to also get sent to run on an INTEL SLURM node. If it isn’t sent on an INTEL node, and ends up on an AMD node, then JULES often crashes saying that it doesn’t have access to the INTEL instruction set, or something like that. There have been a number of tickets in both the old and new CMS Helpdesks about this. It would be great to be able to run JULES on an AMD node.”
Patrick

Hi,

In my experience, Intel Fortran will normally produce an architecture-independent x86 exec unless explicitly told not to. This is usually done via the -xhost or -Qsomething compile flags.
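
For example (an untested sketch, just to illustrate the flag behaviour; hello.f90 here is a made-up test file):

# Untested sketch: build a trivial program with and without -xHost.
# The -xHost build is tuned to the build host's CPU and, on a processor that
# fails its runtime check (e.g. an AMD node), should abort with the same
# "Please verify that both the operating system and the processor support..."
# message.
module load intel/19.0.0
cat > hello.f90 <<'EOF'
program hello
  print *, 'hello'
end program hello
EOF
ifort hello.f90 -o hello_generic        # portable x86-64 exec
ifort -xHost hello.f90 -o hello_xhost   # host-specific exec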

Do we know in which routine the JULES exec is crashing on AMD?

I’m guessing the issue comes from one of these possibilities:
The JULES compile config
The pre-compiled Netcdf/HDF5 libraries
The MPI libraries provided by jasmin.
The MPI compile wrappers provided by jasmin.

Are there output compile and run log files of a JULES suite which fails on AMD which I can look at?

Simon.

Thanks, Simon.
I just modified the standard JULES FLUXNET suite u-al752, so that it will compile in the background on the INTEL cylc1 VM, and so that it will run on AMD in LOTUS/SLURM in the AMD-only short-serial-4hr queue/partition.

I get this error, for example (in ~pmcguire/cylc-run/u-al752AMD/log/job/1/jules_us_wpt_presc0/01/job.err):

“Please verify that both the operating system and the processor support Intel(R) X87, CMOV, MMX, FXSAVE, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, POPCNT, F16C and AVX instructions.
[host671.jc.rl.ac.uk:31272] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 532
[host671.jc.rl.ac.uk:31272] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 166”

It might be easily fixable somehow, or maybe it is more difficult to fix. I remember way back in 2017 that there were issues with optimization directives in fcm_make, or something like that.
Patrick

I located the source of the error message, it’s the mpirun command, specifically
/apps/sw/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/bin/mpirun
which is used when the Intel OpenMPI module is used. This mpirun isn’t compatible with AMD machines. One fix/hack (which I haven’t tried) would be to add
export PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin:$PATH
at the end of the env-script, BUT only for the job execution, not for compilation in the job suite.
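
That is, something like this in the environment of the run tasks only (untested sketch; the which check is just to confirm the right mpirun is being picked up):

# Untested: prepend the GCC OpenMPI bin directory for the run tasks only,
# so that their mpirun is the GCC one rather than the Intel-built one.
export PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin:$PATH
which mpirun   # should now report .../OpenMPI/3.1.1-GCC-7.3.0-2.30/bin/mpirun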

So basically the jasmin supplied Intel MPI environment is incompatible
with the AMD
nodes. As JULES is running on the serial queue on an AMD node, then it
could be
argued that it shouldn’t be running within the MPI environment, even if it’s
running on a single task. However if the AMD nodes are to be made
available for
parallel running with the Intel compiler and MPI, then this is definitely
an issue for jasmin to sort out.

See ~siwilson/bad-amd.sub for a demonstration of the problem.

Simon.

Hi Simon
This is wonderful! Thanks for figuring it out! I will try out your suggested fix.
Patrick

Hi Simon:
I tried your suggestion. It doesn’t appear to have worked, giving a similar error message as before.
See the new setup in:
~pmcguire/roses/u-al752AMD/site/suite.rc.CEDA_JASMIN

(BTW, when I run your test program:
~pmcguire/from_simon/bad-amd.sub, I get this error message, and it is the same regardless of whether I select amd or intel as the SLURM constraint

“mpirun: error while loading shared libraries: libiomp5.so: cannot open shared object file: No such file or directory”

Is there something else I need to do?
)

The log file for the modified u-al752 suite is:
~/cylc-run/u-al752AMD/log/job/1/jules_at_neu_presc0/01/job.out

This log file shows the new PATH variable, in which the new path (/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin) comes before the prior OpenMPI paths:

[INFO] export PATH=/home/users/pmcguire/cylc-run/u-al752AMD/share/fcm_make/build/bin:/apps/jasmin/metomi/bin:/apps/jasmin/metomi/rose-2019.01.8/bin:/apps/jasmin/metomi/bin:/apps/jasmin/metomi/bin:/home/users/pmcguire/cylc-run/u-al752AMD/bin:/home/users/pmcguire/cylc-run/u-al752AMD/bin:/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin:/apps/sw/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/bin:/apps/contrib/gnu/gcc/7.3.0/bin:/apps/contrib/gnu/binutils/2.31/bin:/apps/sw/intel/2019/intelpython3/bin:/apps/sw/intel/2019/advisor_2019.0.0.570901/bin64:/apps/sw/intel/2019/vtune_amplifier_2019.0.2.570779/bin64:/apps/sw/intel/2019/inspector_2019.0.0.569751/bin64:/apps/sw/intel/2019/itac/2019.0.018/intel64/bin:/apps/sw/intel/2019/clck/2019.0/bin/intel64:/apps/sw/intel/2019/compilers_and_libraries_2019.0.117/linux/bin/intel64:/apps/sw/intel/2019/debugger_2019/gdb/intel64/bin:/apps/jasmin/metomi/bin:/home/users/pmcguire/cylc-run/u-al752AMD/share/fcm_make/build/bin:/home/users/pmcguire/cylc-run/u-al752AMD:/apps/jasmin/metomi/cylc-7.8.12/bin:/home/users/pmcguire/cylc-run/u-al752AMD/bin:/home/users/pmcguire/miniconda3/bin:/home/users/pmcguire/miniconda3/condabin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/puppetlabs/bin:/home/users/pmcguire/bin:/apps/sw/intel/2019/parallel_studio_xe_2019.0.045/bin

Patrick

Hi again Simon:
OK. I got the AMD error message with a modified version of your test program:
~pmcguire/from_simon/bad-amd3.sub

#!/bin/bash
#SBATCH --job-name=test
#SBATCH --partition=par-single
#SBATCH --constraint=amd
#SBATCH --time=0:10:00
#SBATCH --ntasks=1

export PATH=/apps/jasmin/metomi/bin:$PATH
module load intel/19.0.0
module load contrib/gnu/gcc/7.3.0
module load eb/OpenMPI/intel/3.1.1
mpirun echo hello

Patrick

Hi again2 Simon:
And I can get rid of the AMD error message with a modified version of your test program, using your suggested modification:
~pmcguire/from_simon/bad-amd4.sub

#!/bin/bash
#SBATCH --job-name=test
#SBATCH --partition=par-single
#SBATCH --constraint=amd
#SBATCH --time=0:10:00
#SBATCH --ntasks=1

export PATH=/apps/jasmin/metomi/bin:$PATH
module load intel/19.0.0
module load contrib/gnu/gcc/7.3.0
module load eb/OpenMPI/intel/3.1.1
export PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin:$PATH
mpirun echo hello

Patrick

Hi again3 Simon:
But when I put that suggested PATH modification in the suite, it doesn’t run JULES properly.

I even tried to put the following three lines in ~/cylc-run/u-al752AMD/share/fcm_make/build/bin/rose-jules-run:

export PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin:$PATH
mpirun echo mpirun1hello
exec rose mpi-launch -v jules.exe

When I retrigger the jules app in the AMD suite, I do see “mpirun1hello” echoed, but it still fails on the next line, where it tries to run JULES.

Any suggestions?
Patrick

Hi.

Most odd. Try adding “--verbose” after mpi-launch, which may provide some more info.

Simon.

Hi Simon:
Does the “-v” do the same thing as “--verbose”? I was already using the “-v”.
Similar to before, the error log file ( ~pmcguire/cylc-run/u-al752AMD/log/job/1/jules_de_akm_presc0/06/job.err ) says:

[host642.jc.rl.ac.uk:10583] PMIX ERROR: NO-PERMISSIONS in file gds_dstore.c at line 702
[host642.jc.rl.ac.uk:10583] PMIX ERROR: NO-PERMISSIONS in file gds_dstore.c at line 711
[host642.jc.rl.ac.uk:14434] PMIX ERROR: NO-PERMISSIONS in file gds_dstore.c at line 702
[host642.jc.rl.ac.uk:14434] PMIX ERROR: NO-PERMISSIONS in file gds_dstore.c at line 711
[WARN] file:ancillaries.nml: skip missing optional source: namelist:jules_rivers_props
[WARN] file:imogen.nml: skip missing optional source: namelist:imogen_anlg_vals_list
[WARN] file:urban.nml: skip missing optional source: namelist:jules_urban_switches
[WARN] file:cable_prognostics.nml: skip missing optional source: namelist:cable_progs
[WARN] file:ancillaries.nml: skip missing optional source: namelist:jules_irrig_props
[WARN] file:urban.nml: skip missing optional source: namelist:jules_urban2t_param
[WARN] file:urban.nml: skip missing optional source: namelist:urban_properties
[WARN] file:ancillaries.nml: skip missing optional source: namelist:jules_crop_props
[WARN] file:crop_params.nml: skip missing optional source: namelist:jules_cropparm
[WARN] file:imogen.nml: skip missing optional source: namelist:imogen_run_list

Please verify that both the operating system and the processor support Intel(R) X87, CMOV, MMX, FXSAVE, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, POPCNT, F16C and AVX instructions.

[host642.jc.rl.ac.uk:21966] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 532
[host642.jc.rl.ac.uk:21966] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 166
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value Unable to start a daemon on the local node (-127) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Unable to start a daemon on the local node" (-127) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[host642.jc.rl.ac.uk:21966] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[FAIL] rose-jules-run <<'__STDIN__'
[FAIL] 
[FAIL] '__STDIN__' # return-code=1
2023-03-27T15:04:15+01:00 CRITICAL - failed/EXIT

Patrick

-v is the same as --verbose, but with the extra switch you now have double verbosity (“-v --verbose”), which is interpreted as asking for extra verbosity, so more output is generated.

However, “-v -v --debug” will provide the greatest amount of output.
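
In rose-jules-run, the launch line would then become something like:

exec rose mpi-launch -v -v --debug jules.exe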

Simon.

Thanks, Simon:
Yesterday I had also tried -vv instead of -v, but that didn’t work.
Thanks for correcting my syntax.

With “-v -v --debug”, I get some better feedback in the job.out file.
See:
~pmcguire/cylc-run/u-al752AMD/log/job/1/jules_de_akm_presc0/13/job.out

I tried also adding the following to ~pmcguire/cylc-run/u-al752AMD/share/fcm_make/build/bin/rose-jules-run:
export PKG_CONFIG_PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/lib/pkgconfig:$PKG_CONFIG_PATH

I will also try updating the LD_LIBRARY_PATH and LIBRARY_PATH next. Are there other paths I should update?
I guess there isn’t a way to module load OpenMPI/3.1.1-GCC-7.3.0-2.30?
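
(To check which OpenMPI builds are actually provided as modules, I could try something like the following, though I haven’t confirmed the module naming:

module avail 2>&1 | grep -i openmpi
)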
Patrick

Hi Simon
This change seems to work for AMD: JULES is running without crashing (so far). Thanks!

Change to ~pmcguire/cylc-run/u-al752AMD/share/fcm_make/build/bin/rose-jules-run:

export PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin:$PATH
export LIBRARY_PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/lib:$LIBRARY_PATH
export LD_LIBRARY_PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/lib:$LD_LIBRARY_PATH
export PKG_CONFIG_PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/lib/pkgconfig:$PKG_CONFIG_PATH

I will now try to move these changes to the file ~pmcguire/roses/u-al752AMD/site/suite.rc.CEDA_JASMIN and to the trunk of u-al752.

Is there some better way to handle this, for the general JULES u-al752 user on JASMIN?
Patrick

Hi Simon
Actually, that change doesn’t work. I tested it for AMD processors on short-serial-4hr. JULES starts to run, reading the namelists. But I guess it gets stuck somewhere, since it doesn’t produce any output or anything else in the log files and it goes to the wallclock limit of 4 hours. Normally the JULES run should take 10 minutes or so.
Maybe I shouldn’t be trying to change LD_LIBRARY_PATH or LIBRARY_PATH?
Patrick

I think the crux of the matter is that the working mpirun requires the GCC OpenMPI LD_LIBRARY_PATH, but the Intel-built exec requires the Intel OpenMPI LD_LIBRARY_PATH. I’m trying to think of a way to use the two in the JULES environment. LD_PRELOAD might work, but it’s very messy.
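
One very rough, untested alternative sketch (a per-executable wrapper rather than LD_PRELOAD; the Intel OpenMPI lib directory below is my guess from the mpirun path quoted earlier, and it assumes jules.exe is on the PATH, as it is in the suite):

# Untested sketch: let mpirun run with the GCC OpenMPI libraries, but have it
# launch a small wrapper that puts the Intel OpenMPI libraries back at the
# front of LD_LIBRARY_PATH before exec-ing the Intel-built jules.exe.
cat > jules_wrapper.sh <<'EOF'
#!/bin/bash
export LD_LIBRARY_PATH=/apps/sw/eb/software/OpenMPI/3.1.1-iccifort-2018.3.222-GCC-7.3.0-2.30/lib:$LD_LIBRARY_PATH
exec jules.exe "$@"
EOF
chmod +x jules_wrapper.sh

export PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/bin:$PATH
export LD_LIBRARY_PATH=/apps/sw/eb/software/OpenMPI/3.1.1-GCC-7.3.0-2.30/lib:$LD_LIBRARY_PATH
mpirun -np 1 ./jules_wrapper.sh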

I’ve contacted jasmin support over this. They really need to rebuild the Intel OpenMPI software stack without the Intel architecture switches.

Simon.