Running GLOMAP-aerosol version of RJ4.0 UM-UKCA on ARCHER-2 from temporary pumatest

Hi Ros,

A few months ago I added GLOMAP-aerosol back in to the v8.4 GA4 UM-UKCA job (Release Job 4.0), see my job xpdb-a; it seemed to be missing from the xoxta job that NCAS-CMS ported to ARCHER-2.

Then, about a month ago, I completed the upgrade steps that take specific jobs from the “GLOMAP-mode v7newprim” codebase within the RJ4.0 release job to the standard UM-UKCA jobs with GLOMAP v8.1 from Yoshioka et al., 2019, JAMES (see my job xpdb-y, a copy of Masaru’s base job teaf-w).

I then also completed the upgrade steps to the main GLOMAP v8.2 codebase (see my job xphr-f, a copy of Sandip’s SMURPHS base job xnbe-b), which was applied with “v3 of UM-UKCA” for the range of strat-aerosol/volc-aerosol papers in 2017-2021 (e.g. in quiescent conditions: Brooke et al., 2017, JGR; for the Tambora-ISA experiment: Marshall et al., 2018, ACP and Clyne et al., 2021, ACP; for the major-volcanic forcing PPE: Marshall et al., 2019, JGR; and for the SMURPHS/ACSIS volcanic forcing datasets: Dhomse et al., 2020, ACP; Antuna-Marrero et al., 2021, ESSD; Feng et al., zenodo-published waveband-mapped datasets).

This was working fine in early February, when I set up and ran a set of GA4 UM-UKCA interactive stratospheric aerosol simulations for the Hunga-Tonga aerosol cloud (see UMUI experiment xphy).

The reason for this post, as well as flagging the availability of these GA4 UM-UKCA trop-aerosol and strat-aerosol “standard jobs” (at v8.4 of the UM), is the problem with PUMA and the temporary switch to “pumatest”.

I was in touch with Andy Heaps last week (see emails below) because the initial submit of the standard “strat-trop v3” GA4 UM-UKCA job xphy-s (with GLOMAP v8.2) was failing: it could not access an “allow_aerosols” user STASHmaster file in Mohit Dalvi’s home directory on PUMA.

It turned out that at that time, Mohit’s mdalvi directory had not yet been re-instated on to the pumatest machine.

After my email below asking for Mohit’s directory to be re-instated, that was done, and my submission of the standard xphy-s job then worked fine.

However, I tried again on Friday afternoon, and it turns out that the job also uses 2 hand-edits and 3 user-STASHmaster files within Dan Partridge’s user directory dan2012.

I managed to find the 2 hand-edits and 2 of the 3 user-STASHmaster files, and copied those over to my directory. The job then progressed a bit further, but there is still a 3rd user-STASHmaster from Dan’s user directory that I couldn’t find a copy of (kohler_dp_8.4).

1) Please can you also re-instate Dan Partridge’s user directory dan2012 on the temporary pumatest system (as you already did for Mohit’s mdalvi directory)?

I also tried submitting without that user-STASHmaster, and the job then fails because it cannot find the ukca directory, which holds a series of hand-edits used in the standard job xphy-s/xnbe-b:

~ukca/hand_edits/VN8.4/config_new_diags_extra.ed
~ukca/hand_edits/VN8.4/sect35_on.ed
~ukca/hand_edits/VN8.4/config_strattrop_jpbk.ed
~ukca/hand_edits/VN8.4/CFC-114_not_in_Rad.ed
~ukca/hand_edits/VN8.4/UKCA_useUMvals.ed

2) Please can you also re-instate the ukca user directory on pumatest?

I set these hand-edits to N rather than Y, which gets round the problem: job xphy-s now completes the “UMUI process stage” OK.

However, when I click submit, there is then a more fundamental problem, with the FCM extract failing on the UMATMOS base repository:

MAIN_SCR: Calling Extract …
Extracting UMATMOS base repository…
UMATMOS base repository extract failed
See extract output file /home/gmann/um/um_extracts/xphys/baserepos/UMATMOS/ext.out
MAIN_SCR: Extract failed
MAIN_SCR stopped with return code 25

I did check whether the FCM commands were working OK when I first ssh’d in to pumatest (see my emails below with Andy H) and the fcm status commands seemed to be working fine, in communicating with the PUMA FCM repository etc.

However – it looks like the FCM build process triggered from the UMUI is not able to progress at the moment – probably just missing a file or some other file-path problem?

Also, I noticed that the PUMA UM Trac web system doesn’t seem to be working.
When I point a browser to https://puma.nerc.ac.uk it just hangs.

I’ve attached a screen-shot of the

From: Graham Mann
Date: Wednesday, 16 February 2022 at 17:12
To: Andy Heaps
Subject: Re: pumatest

Hi Andy,

OK, great – thanks for your help with this.

In the meantime I can get on with the extra FCM commits for the planned code changes
for the shorter-duration (more highly concentrated) volcanic emission of SO2.

The Hunga-Tonga eruption emitted 0.2-0.4Tg of SO2 in just 20 minutes, compared to Pinatubo’s 14-23Tg of SO2 that was emitted over 9 hours — H-T was very explosive and also very concentrated chemistry!!

Cheers
Graham

From: Andy Heaps
Date: Wednesday, 16 February 2022 at 17:09
To: Graham Mann
Subject: Re: pumatest

Hi Graham,
I think Ros is going to do this tomorrow. Time to get back on strike!
Cheers
Andy

On 16/02/2022 16:54, Graham Mann wrote:
Hi Andy,

I’ve logged in to the pumatest machine from remote-access.leeds.ac.uk and can open the
UMUI OK (see the 2nd of the two attached screen-shot PNG files).

And I can also see that FCM commands seem to be working OK from pumatest

However, when I tried to submit a test UMUI job (my recent Hunga-Tonga control job xphy-s)
it fails when trying to access a file within Mohit Dalvi’s directory on PUMA.

This worked fine about 2 weeks ago, and I’m assuming then it’s simply a case of you not having
restored his user directory on the network visible from the pumatest machine.

The 1st PNG file shows the error message: it can’t access the file “allow_aerosols.stash”, which is a user STASHmaster file in Mohit Dalvi’s directory ~mdalvi/umui_jobs/prestash/vn8.1.

Please can you restore Mohit Dalvi’s directory mdalvi on this temporary system?

Thanks a lot

Best regards,

Cheers
Graham

PS I’m supposed to be on strike too!

On 16/02/2022, 16:32, “Graham Mann” wrote:

Hi Andy,

OK -- thanks for this.

I'll try this now -- logging in from the new remote-access.leeds.ac.uk server.

Cheers
Graham

On 16/02/2022, 15:40, "Andy Heaps" wrote:

    Hi Graham,

    PUMA has some issues, so we are moving to a temporary server, and we've 
    restored your home directory from the backups on the evening of the 14th 
    February. Your ssh key should work as normal; the only change is that the 
    server name is now pumatest.nerc.ac.uk. I have reset your password to be 
    **** deleted for NCAS-CMS Helpdesk post **** Please change this when you log in.

    Note:
    Your shell has changed to the bash shell
    The cylc-run directory isn't backed up so won't be present in your home 
    directory
    A further email will come in due course on how to restart any running suites
    I intend to go back on strike again tomorrow, so any queries after the 
    close of play today should go to the CMS help desk.

    Cheers
    Andy

    -- 
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Andy Heaps
    National Centre for Atmospheric Science (NCAS)
    Room 118
    Harry Pitt Building,
    Reading University,
    Earley Gate,
    PO Box 243,
    Reading RG6 6ET
    U.K.

    tel: 0118 378 6421
    fax: 0118 378 8316
    e-mail: andy.heaps@ncas.ac.uk
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Checking the ext.out file referred to in the error message, the FCM extract error is given below.

It says specifically “cannot locate config file”, so I’m guessing this is simply a file-path problem:

[FAIL] fcm:um_br/dev/um/vn8.4_machine_cfg/src/configs/bindings/container.cfg@22831: cannot locate config file, abort at /home/fcm/fcm-2019.09.0/bin/…/lib/FCM1/ConfigSystem.pm line 539

Note also that the “twiddle”/“tilde” symbol (~) doesn’t seem to work for Mohit’s directory, although the directory is there now. Changing ~mdalvi/ to /home/mdalvi/ worked OK once his user directory had been re-instated on pumatest.
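The ~user to full-path substitution described above can be sketched in Python. This is only an illustration: the function name to_full_path is hypothetical, it assumes user directories sit directly under /home (as /home/mdalvi does), and the example paths are pieced together from the file and directory names quoted in this thread.

```python
import re

# Assumption: home directories on pumatest live directly under /home,
# as in the /home/mdalvi example in this thread.
HOME_ROOT = "/home"

# Matches a leading "~user" followed by "/" or end of string.
# A bare "~" (the current user) is deliberately left untouched.
_TILDE_USER = re.compile(r"^~([A-Za-z0-9_.-]+)(?=/|$)")


def to_full_path(path, home_root=HOME_ROOT):
    """Rewrite a leading ~user as <home_root>/user, with no account lookup."""
    return _TILDE_USER.sub(lambda m: f"{home_root}/{m.group(1)}", path)


if __name__ == "__main__":
    for p in (
        "~mdalvi/umui_jobs/prestash/vn8.1/allow_aerosols.stash",
        "~ukca/hand_edits/VN8.4/sect35_on.ed",
    ):
        print(p, "->", to_full_path(p))
```

Because the rewrite is purely textual, it does not depend on the login account existing on the machine, which is the situation with the copied-over directories here.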

But the ukca and dan2012 directories are simply not there when I do a list at /home on pumatest:

-bash-4.1$ pwd
/home
-bash-4.1$ ls
al17051 bassett cms_test1 earsshe fcm hburns jjas3 lost+found mhagdorn pz19486 ros taubry wb19586
amenon bjwmills dcase ee14s2r ggdjl hr20323 jmw240 lpb20 mjm qrw rrigby toddj willie
Amer7632 bk21486 dg19319 ee17vp ggpjv huiling jonah luciad mk1812 radiam24 seg transfer yb19052
andy c.c.symonds donners eelrm glxaf im13009 jonathan luciana n1280run rdeleon simon um yt910424
annette centos douglowe eewz gmann jeff jtalib markmuetz ncastr01 rdfjas simon.tett um1
antonypayne cfam duncanwp EmmaHoward grenville jfgu jwalton markr nd20983 reinhard sjh395 um.orig
aquota.user charles earfw emnicki ha392 jhlee langtont mbareford peterh ri5774 skakala umui
aschurer clang earjcti famous HBryant jjabram Leighton_Regayre mdalvi pmcguire robin swsvalde watson
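The same presence check can be scripted rather than read off the listing by eye. A minimal sketch, assuming the directories a job needs sit directly under /home; the function name missing_dirs is hypothetical, and the list of users is simply the three directories mentioned in this thread.

```python
import os


def missing_dirs(users, home_root="/home"):
    """Return the users whose directory is absent under home_root."""
    return [u for u in users if not os.path.isdir(os.path.join(home_root, u))]


if __name__ == "__main__":
    # Directories that job xphy-s reads files from, per this thread.
    needed = ["mdalvi", "dan2012", "ukca"]
    print("missing:", missing_dirs(needed))
```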

Hi Graham,

  1. I have copied over directories /home/dan2012 and /home/ukca. For the moment you will have to reference them with the full path as ~ won’t work as I have only copied over the directories and not copied over the associated user login account. Andy will need to do that part when he’s back.

  2. I have resync’d the UM repository that the UMUI pulls from, which will fix the “fcm:um_br/dev/um/vn8.4_machine_cfg/src/configs/bindings/container.cfg@22831: cannot locate config file” issue.

  3. Everything that was puma.nerc.ac.uk is now pumatest.nerc.ac.uk.

Give it another try and hopefully that fixes the submission issues.

Regards,
Ros.

Hi Ros,

OK, great – thanks a lot.

That works fine now re: accessing the files in the /home/ukca and /home/dan2012 directories.

I just tried to re-submit the updated v8.4 GA4 UM-UKCA Hunga-Tonga control run (xphy-s) with those paths updated, and it does now process through OK (with the links reverted to the files in the dan2012 and ukca directories rather than the versions I copied to my user directory).

Unfortunately I see that they’ve temporarily disabled log-ins to ARCHER-2 (since 3.00pm today), so I’ve not yet been able to check the remedy for step 2).

But it sounds like that will fix the issue, so thanks!

Cheers
Graham

Hi Ros,

Just one follow-on question – the UM Trac system on PUMA doesn’t seem to be responding.

Do you know when the PUMA UM Trac is likely to be accessible again at https://puma.nerc.ac.uk ?

I did try http://pumatest.nerc.ac.uk and intriguingly it seems to give a large-font two-word answer “It works!”

Hi Graham,

The old PUMA server is down permanently. Equivalent on pumatest is here: https://pumatest.nerc.ac.uk/trac/UM but I can’t guarantee you it’ll work. As I’m sure you can appreciate we’re working hard to get everyone back up and running on pumatest.

Regards,
Ros.

Hi Ros,

With ARCHER-2 having come back up this morning, I’ve just tried submitting again the xphy-s interactive strat-trop aerosol CheST+GLOMAP-v8.2 “control simulation” for the Hunga-Tonga ensemble.

The re-sync of the UM repository did indeed fix the “cannot locate config file” problem, and the FCM extract does now begin and successfully extracts the UMATMOS and JULES base repositories.

It then proceeds to extract the UMSCRIPTS repository, including any branches.

However it then fails at the “Extracting UMATMOS including any branches” stage.

Checking the ~/um/um_extracts/xphys/umatmos/ext.out file, I see there are 2 problems:

  1. The error message there says it cannot find the config file in the specific FCM branch “ARCHER_cce_vn804_acumps”:

/overrides/vn8.4/ARCHER_cce_vn804_acumps

  2. It has also given an earlier error message (see below).
    Possibly this is just informational or a warning, but I guess it may be the reason for the FCM build failure.

“Use of uninitialized value in substitution iterator at /home/fcm/fcm-2019.09.0/bin/…/lib/FCM1/Util.pm line 119.”

Is this also a file-path error or similar re: the change to the pumatest machine?

Thanks for your help with this,

Best regards,

Cheers
Graham

Extract command started on Tue Feb 22 15:33:38 2022.
->Parse configuration: start
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/bindings/container.cfg@22831
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/bindings/UMATMOS_repos.cfg@22831
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/machines/init_options.cfg@22831
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/machines/cray-xc30-cce-archer/machine.cfg@22831
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/machines/cray-xc30-cce-archer/ext_libs/default_paths.cfg@22831
Config file (ext): /home/gmann/umui_jobs/xphys/USR_PATHS_OVRDS
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/machines/cray-xc30-cce-archer/ext_libs/gcom_mpp.cfg@22831
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/machines/cray-xc30-cce-archer/ext_libs/gcom_serial.cfg@22831
Config file (ext): /home/gmann/umui_jobs/xphys/USR_MACH_OVRDS
Config file (ext): /home/gmann/umui_jobs/xphys/FCM_UMATMOS_CFG
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/bindings/bind64_mpp_safe.cfg@22831
Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/machines/cray-xc30-cce-archer/overrides64_safe.cfg@22831
Use of uninitialized value in substitution iterator at /home/fcm/fcm-2019.09.0/bin/…/lib/FCM1/Util.pm line 119.
[FAIL] /overrides/vn8.4/ARCHER_cce_vn804_acumps: cannot locate config file, abort at /home/fcm/fcm-2019.09.0/bin/…/lib/FCM1/ConfigSystem.pm line 539

Config file (ext): svn://pumatest/UM_svn/UM/branches/dev/um/vn8.4_machine_cfg/src/configs/machines/cray-xc30-cce-archer/overrides64_debug.cfg@22831
Config file (ext): /home/gmann/umui_jobs/xphys/USR_FILE_OVRDS
~
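For longer ext.out files, the [FAIL] lines can be pulled out automatically instead of scrolling through the log. The sketch below is not part of FCM or the UMUI; the function name find_failures is hypothetical, and the "[FAIL] target: message" layout it expects is an assumption based only on the two failure lines quoted in this thread.

```python
import re

# Assumed layout "[FAIL] <target>: <message>", taken from the ext.out
# excerpts in this thread rather than from any FCM specification.
_FAIL = re.compile(r"^\[FAIL\]\s+(?P<target>.*?):\s+(?P<message>.*)$")

# The failure line from the log above, copied as it appears in the post.
SAMPLE = (
    "Use of uninitialized value in substitution iterator at "
    "/home/fcm/fcm-2019.09.0/bin/…/lib/FCM1/Util.pm line 119.\n"
    "[FAIL] /overrides/vn8.4/ARCHER_cce_vn804_acumps: cannot locate config file, "
    "abort at /home/fcm/fcm-2019.09.0/bin/…/lib/FCM1/ConfigSystem.pm line 539\n"
)


def find_failures(text):
    """Return a (target, message) pair for every [FAIL] line in the text."""
    hits = []
    for line in text.splitlines():
        m = _FAIL.match(line.strip())
        if m:
            hits.append((m.group("target"), m.group("message")))
    return hits


if __name__ == "__main__":
    for target, message in find_failures(SAMPLE):
        print("target: ", target)
        print("message:", message)
```

To run it on a real log, read the file first, e.g. find_failures(open(path, encoding="utf-8").read()), where path points at the ext.out named in the UMUI error message.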

Hi Graham,

It’s another ~ issue. I copied Luke’s directory over the other day but the associated login account has yet to be migrated. Change the path to Luke’s UM User override files to /home/luke and try again.

Also Mark R’s perftools profiling override is currently not on pumatest - that’s profiling related so I think ok to switch that one off. I can retrieve that override if necessary.

Cheers,
Ros.

Hi Ros,

That’s great, thanks. I should’ve thought to check the over-ride files.
When I put in the explicit home path, the FCM extract then completed fine (with the markr performance-tools over-ride also set to “N”).

Thanks for your help,

Cheers
Graham