I’m trying to set up archiving to JASMIN for my suite u-cy223, which runs the global model + pan-Arctic/pan-Antarctic nests (as you’ve been helping me with here).
I want the archiving to JASMIN to happen automatically, so I’ve been looking at the CANARI suite u-cn134, because it seems to have a lot of useful functionality.
@RosalynHatcher I know you’ve done a lot of work on this suite - how straightforward would it be to do something similar (i.e. output pp streams, convert and postprocess to netCDF, and transfer to JASMIN)?
And can you help me set it up to archive only model outputs?
It looks like u-cy223 is using the UM netCDF output - will you not carry on with that?
You could create a bespoke app (like get_era_data) that rsyncs just the files required - but that’s effectively what the built-in archiving does. We can try to work out the archiving settings to shift the right files.
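Something along these lines might do as a starting point - an untested sketch, where the source directory, JASMIN host and GWS path are all placeholders you’d need to change for your own project:

#!/bin/bash
# Hypothetical bespoke transfer app (not get_era_data itself): rsync just the
# UM fieldsfiles for the current cycle from ARCHER2 to a JASMIN GWS.
set -eu

SRC_DIR="$ROSE_DATAC/glm/um"                  # cycle data directory on ARCHER2
DEST="hpxfer1.jasmin.ac.uk:/gws/nopw/j04/myproject/u-cy223/$CYLC_TASK_CYCLE_POINT"

# Only the pa stream files; --partial lets an interrupted transfer be resumed.
rsync -av --partial -e "ssh -o BatchMode=yes" "$SRC_DIR"/umglaa_pa* "$DEST"/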
I don’t think trying to incorporate the data management from u-cn134 would be worthwhile. The u-cn134 data management is geared specifically for the CANARI project.
Thanks Grenville. I only want to transfer the output files - I’m in two minds about whether I should transfer pp files, which take up less space, and then stashsplit them on JASMIN as part of my processing, or output netCDF in the first place. I’m currently leaning towards the former.
It sounds like it’s easier to just write a bespoke transfer script/app then - I would very much appreciate some help on it! You can probably see how (not) far I’ve got already…
u-cy223 has successfully completed a week (the current run is stalling because I added some new STASH). Otherwise u-cy175 and u-cy173 both finished a week’s worth of simulation, I think…?
I have been trying to understand u-cy223 more and just wanted to check that this is the way you want the suite to run.
The cycle length (CYCLE_INT_HR) is set to 24 hrs and CRUN_LEN is set to 6 hrs, yet each cycle produces 6 output files:
-rw-r--r-- 1 shakka n02 305086464 Aug 16 10:33 umglaa_pa000
-rw-r--r-- 1 shakka n02 209293312 Aug 16 10:43 umglaa_pa006
-rw-r--r-- 1 shakka n02 211345408 Aug 16 10:48 umglaa_pa012
-rw-r--r-- 1 shakka n02 212590592 Aug 16 10:53 umglaa_pa018
-rw-r--r-- 1 shakka n02 213549056 Aug 16 10:59 umglaa_pa024
-rw-r--r-- 1 shakka n02 214069248 Aug 16 11:02 umglaa_pa030
This results in a doubling of some output (6 files × 6 hrs = 36 hrs of output per 24-hr cycle, i.e. a 12-hour overlap between consecutive cycles).
Looking in /work/n02/n02/shakka/cylc-run/u-cy223/share/cycle/19991231T1200Z/glm/um/umglaa_pa018, I see output for U WIND on PRESSURE LEVELS (for example) for 01/01/2000 at 1200
whereas in /work/n02/n02/shakka/cylc-run/u-cy223/share/cycle/20000101T1200Z/glm/um/umglaa_pa000 there is output for the same diagnostic at the same validity time, but the data are different.
It’s safe to assume I’m probably just doing something wrong here - I am trying to get 36-hour forecasts and discard the first 12 hours as spin-up, then write only t+12 to t+36 (which should be 00Z to 00Z) to file. I’ll then concatenate those together into a continuous timeseries.
There are also a few variables which aren’t outputting (anything on tiles, snow tiles or soil levels - I’m guessing this is a JULES thing?).
On archiving - the archiving step says ‘succeeded’ at every cycle, but I’ve evidently not set something up correctly because the transfer hasn’t happened.
Please see my copy of u-cy223, where archive_files_rsync sends glm files to JASMIN and renames them from umglaa_pa000 to my_data_p000 etc.
I needed to set DATADIR in the task environment since the glm does not do so - the LAM tasks do set DATADIR, so a separate task might be the way to go for them (needed anyway since the file renaming will be different, I suppose).
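In suite.rc it’s just an [[[environment]]] entry on the archive task - something like this (the path here is illustrative; it matches the glm/um layout under the cycle directory, but check it against your own suite):

[[archive_files_rsync]]
    [[[environment]]]
        # glm tasks don't export DATADIR themselves, so set it here
        DATADIR = $ROSE_DATAC/glm/um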
The renaming uses Python regular expressions, which aren’t the easiest.
I’d create a task for the LAMs that simply copies the data first, then tackle the renaming later.
Right, so I need to set the archiving up to run separately for each region? Because if the files are named the same for both domains, I don’t want one domain’s files to overwrite the other’s when I transfer them.
Hi Grenville, just coming back to this now. I was wondering if it would make sense to do this as two separate archiving tasks - one for each domain. That way, the renaming by domain could be hard-wired in. What do you think of that idea?
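Roughly what I have in mind is something like this (just a sketch - the ARCHIVE family, the RENAME_PREFIX variable and the Antarctic path are all made up for illustration, and the Antarctic path just mirrors the Arctic one):

[[archive_files_Arctic]]
    inherit = ARCHIVE
    [[[environment]]]
        DATADIR = $ROSE_DATAC/Arctic/11km/ga7_24-36/um
        RENAME_PREFIX = MetUM_PolarRES_Arctic_11km

[[archive_files_Antarctic]]
    inherit = ARCHIVE
    [[[environment]]]
        DATADIR = $ROSE_DATAC/Antarctic/11km/ga7_24-36/um
        RENAME_PREFIX = MetUM_PolarRES_Antarctic_11km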
And can you help me with this too?:
It’s safe to assume I’m probably just doing something wrong here - I am trying to get 36-hour forecasts and discard the first 12 hours as spin-up, then write only t+12 to t+36 (which should be 00Z to 00Z) to file. I’ll then concatenate those together into a continuous timeseries.
There are also a few variables which aren’t outputting (anything on tiles, snow tiles or soil levels - I’m guessing this is a JULES thing?).
Update: I’m now testing a separate archiving step for each domain. I copied suite u-cy223 and am now running the glm with just one nested domain (Arctic) in u-cz478.
I added an archive_files_Arctic task in the same format as @grenville’s u-cy223 archive_files_rsync task, and updated the suite.rc to include an [[[environment]]] definition of DATADIR, which I had to hard-wire as
DATADIR=$ROSE_DATAC/Arctic/11km/ga7_24-36/um/
FYI I tried to be clever and use the syntax in /suite-runtime/lams.rc, i.e.
because I thought I could use this to make it work with both domains within one suite… but that didn’t work.
Now it seems to be at least attempting to transfer the files as expected, but I’m getting an rsync error when trying to connect to either hpxfer1 or xfer1 on JASMIN. I am able to connect to both from the ARCHER2 command line without being prompted for a password, so there shouldn’t be a connection or authentication issue.
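For reference, this is the kind of check I can run from an ARCHER2 login shell, with prompts disabled so it behaves more like the batch task would (sketch only):

# forbid password/passphrase prompts so a missing key or agent fails loudly
ssh -o BatchMode=yes hpxfer1.jasmin.ac.uk true && echo "non-interactive auth OK"

# if that fails, -vv shows which keys/agent ssh is actually trying
ssh -vv -o BatchMode=yes hpxfer1.jasmin.ac.uk true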
[As an aside, the renaming should produce MetUM_PolarRES_Arctic_11km_$CYCLE_out_fileid_000, but the result seems to be missing the ‘o’ in ‘out’ - not sure why that’s happening, but it doesn’t matter too much.]
Try running the archive tasks on the login node - there is no agent running on the compute nodes. So, add method = background (to both archive tasks), e.g.
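(Sketch only - the exact section names depend on the Cylc version: older suites spell this [[[job submission]]] method = background, Cylc 7 spells it [[[job]]] batch system = background. The same addition goes on the other archive task.)

[[archive_files_Arctic]]
    [[[job]]]
        # run as a background job on the login node rather than via the batch system
        batch system = background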
I tried this and got an authentication error, despite being able to ssh into hpxfer1 directly from ARCHER2. Do I need to add something somewhere to allow the ssh agent to work in the background?
Hi Grenville,
Tried this, and I get the “Initialising new SSH agent…” message on login, but I’m still getting a permission denied error when I try to ssh into hpxfer1. The security key is exactly as it should be, so I’m a little confused.
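For reference, this is the sort of check I’m running from an ARCHER2 shell to see whether the agent actually holds the key (the key filename is just a placeholder for whichever key is registered with JASMIN):

echo "SSH_AUTH_SOCK=$SSH_AUTH_SOCK"   # empty means no agent is visible in this shell
ssh-add -l                            # should list the key registered with JASMIN

# if the key isn't listed, add it explicitly
ssh-add ~/.ssh/id_rsa_jasmin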
Cheers
Ella