Setting up archer2-jasmin archiving in the nesting suite

I’m trying to set up archiving to jasmin on my suite u-cy223 which runs the global model + pan-Arctic/pan-Antarctic nests (as you’ve been helping me with here)

I want to be able to set up automatic archiving to jasmin, so I’ve been looking at the CANARI suite u-cn134, because it seems to have a lot of useful functionality.

@RosalynHatcher I know you’ve done a lot of work on this suite - how straightforward would it be to do something similar (i.e. output pp streams, convert and postprocess to netCDF, and transfer to jasmin)?

And can you help me set it up to archive only model outputs?


It looks like u-cy223 is using the UM netCDF output - will you not carry on with that?

You could create a bespoke app (like get_era_data) that rsync’d just the files required - but that’s effectively just what the built-in archiving does. We can try to work out the settings for archiving to shift the right files.

I don’t think trying to incorporate the data management from u-cn134 would be worthwhile. The u-cn134 data management is geared specifically for the CANARI project.

How much data do you anticipate moving to JASMIN?


Thanks Grenville. I only want to transfer the output files - am currently in two minds about whether I should transfer as pp files that take up less space and then stashsplit them on jasmin as part of my processing, or if I should output as netcdf in the first place. Am currently leaning towards the former.

Sounds like it’s easier to just write a bespoke transfer script/app then - would very much appreciate some help on it! you can probably see how (not) far I’ve got already…

Have you got a suite that has run for a while and created output over several cycles?

u-cy223 has successfully completed a week (the current run is stalling because I added in some new STASH). Otherwise u-cy175 and 173 both finished a week’s worth of simulation, I think…?

I don’t see any output in /home/n02/n02/shakka/cylc-run/u-cy175/share/cycle/*/Arctic/11km/ga7_24-36/um

there is glm/um output for u-cy175

I don’t see any output for the 173 suite?

Ah yes, I had to delete it to make space to run the others! Doh. Hopefully will have some output from u-cy223 imminently…


I have been trying to understand u-cy223 more and just wanted to check that this is the way you want the suite to run.

The cycle length (CYCLE_INT_HR) is set to 24 hrs and CRUN_LEN is set to 6 hrs, yet this creates 6 runs per cycle:

-rw-r--r-- 1 shakka n02 305086464 Aug 16 10:33 umglaa_pa000
-rw-r--r-- 1 shakka n02 209293312 Aug 16 10:43 umglaa_pa006
-rw-r--r-- 1 shakka n02 211345408 Aug 16 10:48 umglaa_pa012
-rw-r--r-- 1 shakka n02 212590592 Aug 16 10:53 umglaa_pa018
-rw-r--r-- 1 shakka n02 213549056 Aug 16 10:59 umglaa_pa024
-rw-r--r-- 1 shakka n02 214069248 Aug 16 11:02 umglaa_pa030

this results in a doubling of some output.

Looking in /work/n02/n02/shakka/cylc-run/u-cy223/share/cycle/19991231T1200Z/glm/um/umglaa_pa018, I see output for U WIND on PRESSURE LEVELS (for example) for 01/01/2000 at 1200

but in /work/n02/n02/shakka/cylc-run/u-cy223/share/cycle/20000101T1200Z/glm/um/umglaa_pa000 , there is output for the same diagnostic at the same time, but the data is different.

I guess you don’t want a continuous 12 month run?


Hi Grenville,

It’s safe to assume I’m probably just doing something wrong here - I am trying to get 36 hour forecasts and discard the first 12 hours as spin-up, then output only t+12 to t+36 (which should be 00Z to 00Z) to file. I’ll then concatenate those together into a continuous timeseries.

There are also a few variables which aren’t outputting anything (on tiles, snow tiles or soil levels) - I’m guessing this is a JULES thing?

On archiving - the archiving step says ‘succeeded’ at every cycle, but I’ve evidently not set something up correctly because the transfer hasn’t happened.

So far, the contents of my app/rose-app.conf are:

command-format=rsync %(sources)s %(target)s


I’m guessing that I’m missing something here?
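For comparison, a more complete rose_arch app config usually needs a mode line, source/target prefixes, and a target section naming the destination directory - something like the sketch below (the section name, glob, and JASMIN path here are placeholders, not settings from the suite):

```ini
# sketch only: the target section, glob and paths are placeholders
mode=rose_arch

[arch]
command-format=rsync -av %(sources)s %(target)s
source-prefix=$DATADIR/
target-prefix=<jasmin-user>@hpxfer1.jasmin.ac.uk:

[arch:/gws/nopw/j04/<project>/<path>/]
source=umglaa_p*
```

The target for each `[arch:...]` section is formed by joining `target-prefix` to the section name, and `source` globs are resolved relative to `source-prefix`.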

Thanks for your help as ever

Ok, I’ve tried working through some of the archiving errors and now getting stuck with the renaming part.

I want to rename the files from each region (which have the same names) as <REGION_NAME>

i.e. 20000101T1200Z_Antarctic_m3h_000 etc.

but I can’t figure out a clean way to do this using the rose environment variables that are available.

Currently I’ve got:

command-format=rsync %(sources)s %(target)s

rename-format=%(cycle)s_(?P<tag>)_%(name)s

Can I define a REGNAME somehow ?


Please see my copy of u-cy223 - where archive_files_rsync sends glm files to JASMIN and renames them from umglaa_pa000 to my_data_p000 etc

I needed to set DATADIR in the task environment since the glm does not do so – the LAM tasks do set DATADIR, so a separate task might be the way to go for them (needed anyway since the file renaming will be different I suppose.)

The renaming uses Python regular expressions, which aren’t the easiest.

I’d create a task for the LAMs that simply copies the data first then tackle that renaming later.
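To make the regex behaviour concrete: rose_arch applies rename-parser (a regex with named groups) to each source file name, then fills rename-format with %(cycle)s, %(name)s and the named groups. A plain-Python sketch of that mechanism (the pattern and the Antarctic label are assumptions based on the filenames in this thread):

```python
import re

# rename-parser equivalent: pull a "tag" group out of the UM file name
rename_parser = re.compile(r"^umglaa_(?P<tag>p[a-z]\d{3})$")
# rename-format equivalent: %-style substitution of cycle/name/groups
rename_format = "%(cycle)s_Antarctic_%(tag)s"

def rename(name, cycle):
    groups = rename_parser.match(name).groupdict()
    return rename_format % dict(groups, name=name, cycle=cycle)

print(rename("umglaa_pa000", "20000101T1200Z"))
# → 20000101T1200Z_Antarctic_pa000
```

So a hard-wired region label in rename-format (one per archiving task) is one way to get the `<REGION_NAME>` into the file name.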


Right, so I need to set the archiving up to run separately for each region? Because if the files are named the same for both domains I don’t want it to overwrite one of the domains’ files when I transfer them.

How can I handle that?

Hi Grenville, just coming back to this now. I was wondering if it would make sense to do this as two separate archiving tasks - one for each domain. That way, the renaming by domain could be hard-wired in. What do you think of that idea?

And can you help me on this too?:

It’s safe to assume I’m probably just doing something wrong here - I am trying to get 36 hour forecasts and discard the first 12 hours as spin-up, then output only t+12 to t+36 (which should be 00Z to 00Z) to file. I’ll then concatenate those together into a continuous timeseries.

There are also a few variables which aren’t outputting anything (on tiles, snow tiles or soil levels) - I’m guessing this is a JULES thing?

Update: now testing using a separate archiving step for each domain. Copied suite u-cy223 and am now running glm with just one nested domain (Arctic) in u-cz478.

I added an archive_files_Arctic task with the same format as @grenville’s u-cy223 archive_files_rsync and updated the suite.rc to include an [[[environment]]] definition of DATADIR, which I had to hard-wire as


FYI I tried to be clever and use the syntax in /suite-runtime/lams.rc, i.e.

DATADIR = {{regn["name"]}}/{{resln["name"]}}/{{mod["name"]}}

because I thought I could use this to make it work with both domains within one suite… but that didn’t work.
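If it helps, the likely reason that fails is scoping: regn, resln and mod are loop variables inside a {% for %} block in lams.rc, so they are undefined in a standalone task definition elsewhere in the suite. One option (a sketch only - the loop variable and list names are assumptions about this suite) is to generate one archive task per region inside that same loop:

```
{# sketch: define the archive task inside the existing lams.rc loop,
   so regn/resln/mod are in scope; names here are assumptions #}
{% for regn in REGIONS %}
    [[archive_files_{{regn["name"]}}]]
        inherit = None, HOST_HPC
        [[[environment]]]
            DATADIR = {{regn["name"]}}/{{resln["name"]}}/{{mod["name"]}}
{% endfor %}
```

That also gives each region its own task name, which sidesteps the file-overwriting worry above.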

Now, it seems to be at least attempting to transfer the files as expected, but I’m getting an rsync error when trying to connect to either hpxfer1 or xfer1 on jasmin. I am able to connect from the archer2 command line to both without being prompted for a password, which means there shouldn’t be a connection or authentication issue.

The error I’m getting is:

[FAIL] rsync -aLv /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_day_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_mlev_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_m3h_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_mi1h_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_plev_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_i3h_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_m6h_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_3ht_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_i6h_000 /work/n02/n02/shakka/cylc-run/u-cz478/work/19991231T1200Z/archive_files_Arctic/tmpBZ0tbh/MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_6ht_000 # return-code=255, stderr=
[FAIL] ssh: connect to host port 22: Connection timed out
[FAIL] rsync: connection unexpectedly closed (0 bytes received so far) [sender]
[FAIL] rsync error: unexplained error (code 255) at io.c(228) [sender=3.2.3]
[FAIL] ! /gws/nopw/j04/polarres/ella/evaluation_run/Arcticfiles [compress=None, t(init)=2023-09-01T10:59:00Z, dt(tran)=0s, dt(arch)=130s, ret-code=255]
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_3ht_000 (ut_3ht_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_6ht_000 (ut_6ht_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_day_000 (ut_day_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_i3h_000 (ut_i3h_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_i6h_000 (ut_i6h_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_m3h_000 (ut_m3h_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_m6h_000 (ut_m6h_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_mi1h_000 (ut_mi1h_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_mlev_000 (ut_mlev_000)
[FAIL] !	MetUM_PolarRES_Arctic_11km_19991231T1200Z_ut_plev_000 (ut_plev_000)
2023-09-01T11:01:12Z CRITICAL - failed/EXIT

… any ideas?

[as an aside, the renaming should be renaming as MetUM_PolarRES_Arctic_11km_$CYCLE_out_fileid_000 , but seems to be missing the ‘o’ in ‘out’ - not sure why that’s happening but it doesn’t matter too much]

Hi Ella

Sorry this is slow - holiday intervened.

Try running the archive tasks on the login node - there is no ssh agent running on the compute nodes. So, add method = background (to both archive tasks), e.g.

        inherit = None, HOST_HPC
        execution retry delays = PT15M, PT15M, PT30M, PT60M, PT60M, PT180M, PT360M, PT360M
        method = background
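For reference, exactly where this setting lives depends on the Cylc version: Cylc 7 expects `batch system = background` under `[[[job]]]`, while older releases used `method = background` under `[[[job submission]]]`. A sketch for a Cylc 7 suite.rc, using the task name and retry delays from this thread:

```ini
[[archive_files_Arctic]]
    inherit = None, HOST_HPC
    [[[job]]]
        batch system = background
        execution retry delays = PT15M, PT15M, PT30M, PT60M, PT60M, PT180M, PT360M, PT360M
```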

Outrageous! Hope you enjoyed your time off 🙂

I tried this and I got an authentication error, despite being able to ssh into hpxfer1 directly from archer. Do I need to add something somewhere to allow the ssh agent to work in the background?

In your .bashrc, add

# ssh-agent setup on login nodes
. ~/.ssh/ssh-setup

copy /home/n02/n02/grenvill/ssh-setup to your .ssh directory
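For anyone following along, scripts of this kind are usually a variant of the classic agent-bootstrap snippet below (a sketch - Grenville’s actual file may differ):

```shell
# sketch of a typical ssh-setup: reuse the agent recorded in
# ~/.ssh/environment if it is still alive, else start a new one
SSH_ENV="$HOME/.ssh/environment"

start_agent () {
    echo "Initialising new SSH agent..."
    # strip the agent's own echo line, save the env vars, source them
    ssh-agent | sed 's/^echo/#echo/' > "$SSH_ENV"
    chmod 600 "$SSH_ENV"
    . "$SSH_ENV" > /dev/null
}

if [ -f "$SSH_ENV" ]; then
    . "$SSH_ENV" > /dev/null
    # kill -0 just tests whether the recorded agent PID still exists
    kill -0 "$SSH_AGENT_PID" 2>/dev/null || start_agent
else
    start_agent
fi
```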

Log out & back in. It should say Initialising new SSH agent..., then add your jasmin key.

If you don’t have a ~/.ssh/config, create one and add

User <your jasmin user name>
IdentityFile ~/.ssh/<your jasmin key>
ForwardAgent no

check that you can ssh without having to type a password/passphrase

then try archiving again

Hi Grenville,
Tried this, and I get the “Initialising new SSH agent…” on login, but am still getting a permission denied error when I try to ssh into hpxfer1. The security key is exactly as it should be, so I’m a little confused.

did you add your jasmin key?

ssh-add ~/.ssh/<your jasmin key>

then to check
ssh-add -l

what do you get for

ssh -vvv


Nope. Doh! How often will I have to re-add my key? Can I include it in my bashrc or something?
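The key stays cached for the lifetime of the ssh-agent process, so it only needs re-adding when a fresh agent starts (e.g. after the login node reboots). You can guard the ssh-add in your ~/.bashrc, after sourcing ~/.ssh/ssh-setup, so it only runs when the key is missing - a sketch (the key filename is an assumption, and ssh-add will prompt for the passphrase whenever it does run):

```shell
# add the JASMIN key only when the agent doesn't already hold it
# (replace id_rsa_jasmin with your actual key file)
ssh-add -l 2>/dev/null | grep -q jasmin || ssh-add "$HOME/.ssh/id_rsa_jasmin"
```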

After adding my ssh key, I am now back to ‘connection timed out’ errors…