Modification required to run a Monsoon suite on ARCHER2

Hi CMS,

I am trying to modify a suite (which was running on Monsoon) to run on ARCHER2. I am carrying forward the iodine development work done by Ewa (she did all the development on Monsoon, and the Rose suite for her job is u-cn823). I have taken a copy of u-cn823; the new suite is u-cp798. How should I modify this suite to run on ARCHER2? I think I have to change the 'Host Machine' to ARCHER2 and update some (maybe many!) directories.

Can you please point me to any page that describes how to modify a suite to run on ARCHER2? I think some instructions for porting a Monsoon suite to ARCHER2 were available at http://cms.ncas.ac.uk/wiki/Archer2, but it seems this page has been moved.

Regards, Alok

Please see Porting.

Hi Grenville,

Thanks for pointing me to this page. I have followed the instructions and modified the suite (u-cp798), which is a copy of a Monsoon suite. I am unable to submit the job on ARCHER2 and get the following error message:

[INFO] REGISTERED u-cp798 -> /home/akpandeyjnu/cylc-run/u-cp798
[FAIL] cylc validate -o /tmp/tmph4sFgn --strict u-cp798 # return-code=1, stderr=
[FAIL] Jinja2Error:
[FAIL] File "/home/akpandeyjnu/cylc-run/u-cp798/site/archer2.rc", line 159, in top-level template code
[FAIL] execution time limit = {{CLOCK}}
[FAIL] UndefinedError: 'AINITIAL' is undefined

Please help me.

Regards, Alok

Dear CMS,
I think the problem is with the startdump file. I modified the archer2.rc file in /home/akpandeyjnu/cylc-run/u-cp798 to point to the startdump file '/work/n02/n02/alok/ukca_start/ce185a.da20091201_00', but after running 'rose suite-run' the archer2.rc file is automatically restored to the previous version and I still get the same error. Is it defined somewhere in Rose?
Please point me to the CMIP6_AEROCHEM_EMS, CMIP6_BIOG_EMS, CHEM_INIT_FILE and SST_SICE_ANCIL directories on ARCHER2.

Regards, Alok

Hi Alok,

Any changes you make to the suite must be in the ~/roses/u-cp798 directory. Change the site/archer2.rc file in there. The files under ~/cylc-run are then generated from these.
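
To find where a variable such as AINITIAL is referenced in the source copy of the suite, a search along these lines may help. This is only a sketch: `find_suite_var` is a hypothetical helper name, not a Rose or cylc command, and it simply wraps `grep`.

```shell
# Hypothetical helper (not part of Rose or cylc): list every line under a
# suite source directory that mentions the given variable name.
# Returns non-zero if the name is not found anywhere.
find_suite_var() {
    var="$1"
    suite_dir="$2"
    grep -rn -- "$var" "$suite_dir"
}

# Example usage with the names from this thread:
#   find_suite_var AINITIAL ~/roses/u-cp798
# Edit the file(s) it reports under ~/roses/u-cp798, then run
# `rose suite-run` again from that directory to regenerate ~/cylc-run.
```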

You should find all the equivalent ancil paths under the central UMDIR on ARCHER2: /work/y07/shared/umshared

e.g. /projects/ancils/cmip6 on Monsoon is /work/y07/shared/umshared/cmip6 on ARCHER2
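
As a rough sketch of that path translation, the helper below swaps the Monsoon prefix for the ARCHER2 one. `monsoon_to_archer2` is a hypothetical name, and only the cmip6 mapping is confirmed above; treating every subdirectory of /projects/ancils the same way is an assumption, so check that each translated path actually exists on ARCHER2.

```shell
# Hypothetical helper: rewrite a Monsoon ancil path to its ARCHER2 equivalent.
# Assumption: everything under /projects/ancils maps to the same relative
# location under /work/y07/shared/umshared (only cmip6 is confirmed here).
monsoon_to_archer2() {
    path="$1"
    monsoon_root="/projects/ancils"
    archer2_root="/work/y07/shared/umshared"
    case "$path" in
        "$monsoon_root"/*)
            # Strip the Monsoon prefix and prepend the ARCHER2 one.
            printf '%s\n' "${archer2_root}${path#"$monsoon_root"}"
            ;;
        *)
            # Not an ancil path we know how to translate: leave unchanged.
            printf '%s\n' "$path"
            ;;
    esac
}

monsoon_to_archer2 /projects/ancils/cmip6
# prints /work/y07/shared/umshared/cmip6
```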

Regards,
Ros.

Hi Ros,

Thank you for pointing me to the ancil path. I have modified the archer2.rc file under /home/akpandeyjnu/roses/u-cp798/site and am now able to submit the job from pumatest, but fcm_make_um fails and the job.err file has the following message:

[FAIL] config-file=/home/akpandeyjnu/cylc-run/u-cp798/work/20091201T0000Z/fcm_make_um/fcm-make.cfg:3
[FAIL] config-file= - svn://pumatest/um.xm_svn/main/branches/dev/simonwilson/vn11.0_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg
[FAIL] svn://pumatest/um.xm_svn/main/branches/dev/simonwilson/vn11.0_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg: cannot load config file
[FAIL] svn://pumatest/um.xm_svn/main/branches/dev/simonwilson/vn11.0_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg: not found
[FAIL] svn: warning: W170000: URL 'svn://pumatest/um.xm_svn/main/branches/dev/simonwilson/vn11.0_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg' non-existent in revision 112491
[FAIL]
[FAIL] svn: E200009: Could not display info for all targets because some targets don't exist

[FAIL] fcm make -f /home/akpandeyjnu/cylc-run/u-cp798/work/20091201T0000Z/fcm_make_um/fcm-make.cfg -C /home/akpandeyjnu/cylc-run/u-cp798/share/fcm_make_um -j 4 mirror.target=login.archer2.ac.uk:cylc-run/u-cp798/share/fcm_make_um mirror.prop{config-file.name}=2 # return-code=1
2022-09-16T15:19:32Z CRITICAL - failed/EXIT

The job-activity.log has the following information:
[jobs-submit ret_code] 0
[jobs-submit out] 2022-09-16T15:19:30Z|20091201T0000Z/fcm_make_um/01|0|20124
2022-09-16T15:19:30Z [STDOUT] 20124
[(('event-mail', 'failed'), 1) ret_code] 0

Could you please suggest a possible cause?

Regards, Alok

Hi Alok,

The error is because it can’t find the compile configs for ARCHER2. We do not have UM vn11.0 installed on ARCHER2. You will need to upgrade to a newer version of the UM.
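
If it helps, a check like the following can confirm whether a compile config exists before you run the suite. It is a sketch only: `config_exists` is a hypothetical helper, and it assumes the `svn` client is on your PATH and can reach the repository. It relies on `svn info` returning a non-zero status for a missing target, which is the same failure shown in your job.err.

```shell
# Hypothetical helper: succeed if the given repository URL exists.
# Assumes the svn client is installed and the repository is reachable.
config_exists() {
    url="$1"
    svn info "$url" >/dev/null 2>&1
}

# Example usage (URL taken from the job.err above):
#   config_exists svn://pumatest/um.xm_svn/main/branches/dev/simonwilson/vn11.0_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg \
#       || echo "compile config not found"
```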

Regards,
Ros

Hi Ros,

Thanks for this. I have replaced app/fcm_make_um/rose-app.conf with the version from the AMIP job. fcm_make_um, fcm_make2_um and install_ancil now succeed, but I am getting an error in recon. The error message is:
The following have been reloaded with a version change:

  1. cce/11.0.4 => cce/12.0.3

[WARN] file:STASHC: skip missing optional source: namelist:exclude_package(:)
[WARN] file:RECONA: skip missing optional source: namelist:trans(:)
[FAIL] file:STASHmaster=source=fcm:um.xm_br/dev/ewabednarz/vn11.0_DEST_plus_Iodine/rose-meta/um-atmos/vn11.0/etc/stash/STASHmaster@HEAD: bad or missing value
2022-09-21T13:51:37Z CRITICAL - failed/EXIT

Please note that the chemistry scheme was developed at vn11.0 by my colleagues on Monsoon, and I am trying to run the same job on ARCHER2.

Regards, Alok

Alok

There is a procedure for upgrading a suite, which you appear not to have followed. Please look at the Upgrading page of the Rose documentation (version 2.0.0).

Also to fix the problem with the STASHmaster:

  1. Add the following to the rose-suite.conf file:
[file:app/um/file/STASHmaster]
source=fcm:um.xm_br/dev/ewabednarz/vn11.0_DEST_plus_Iodine/rose-meta/um-atmos/vn11.0/etc/stash/STASHmaster@HEAD
  2. Remove the following lines from the app/um/rose-app.conf file:
[file:STASHmaster]
source=fcm:um.xm_br/dev/ewabednarz/vn11.0_DEST_plus_Iodine/rose-meta/um-atmos/vn11.0/etc/stash/STASHmaster@HEAD

And obviously you will need to upgrade that vn11.0 branch first.
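
Once you have made the STASHmaster change, a quick check like this can confirm that the section ended up in the right file. It is a sketch only: `stashmaster_moved` is a hypothetical helper that uses plain `grep` rather than any Rose command, and it assumes the standard suite layout with rose-suite.conf at the top level.

```shell
# Hypothetical check: succeed only if the STASHmaster section is present in
# rose-suite.conf and no longer present in app/um/rose-app.conf.
# Note: a missing app/um/rose-app.conf also counts as "no longer present".
stashmaster_moved() {
    suite_dir="$1"
    grep -qF '[file:app/um/file/STASHmaster]' "$suite_dir/rose-suite.conf" &&
        ! grep -qF '[file:STASHmaster]' "$suite_dir/app/um/rose-app.conf" 2>/dev/null
}

# Example usage with the suite from this thread:
#   stashmaster_moved ~/roses/u-cp798 && echo "STASHmaster section moved"
```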

Grenville