Archer2 storage space and MetUM running requirements for PolarRES

Yet another config question I’m afraid. I’m still working on producing my own ancillaries from high-resolution orography data for the Arctic and Antarctic, but there is more work to do before I can get them into ancil format. In the meantime, I’ve tried to run Andrew’s Antarctic CORDEX simulations for December 2000 using the ancillaries, SST, SIC and forcing data that he already has, but I keep running up against ‘file not found’ errors. It first happened during the make ancils steps, which I was able to get past because I have the ancils produced by Andrew’s previous simulations. However, I’m now getting the same error in the glm recon step, specifically “Failed to open file /work/n02/n02/shakka/cylc-run/u-co137/share/cycle/20001201T0000Z/glm/ics/glm_atmanl”. I’ve verified that the file is definitely there on archer; it’s almost like my account on pumatest isn’t able to ‘see’ the file system on archer. Can you think of any reasons for this?

Best wishes,

Ella

In /work/n02/n02/shakka/cylc-run/u-co137/log/job/20001201T0000Z/Antarctic_CORDEX_0p11deg_ancil_orog/01/job.out
there is a message about a missing file.
Your file doesn’t really exist as a file: it’s a symlink. Specifically, it’s a link to qrparm.mask_igbp, which is itself a link to
/home/n02/n02/shakka/remaking_ancils/mask
which is in the /home directory, so it can’t be seen from the compute nodes.

It needs to be on /work, so perhaps you should make a directory on /work and put your files there?
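
For anyone hitting the same problem, here is a minimal sketch of how a chain of links like the one above could be checked before submitting a job. The function names symlink_chain and hops_under are made up for illustration and are not part of any suite or ARCHER2 tooling; it only assumes a Linux file system and standard Python.

```python
import os
from pathlib import Path


def symlink_chain(path):
    """Follow a chain of symlinks and return every path visited, in order."""
    chain = [Path(path)]
    seen = set()
    while chain[-1].is_symlink():
        current = chain[-1]
        if current in seen:  # guard against circular links
            break
        seen.add(current)
        target = Path(os.readlink(current))
        if not target.is_absolute():
            # Relative link targets are resolved against the link's directory.
            target = current.parent / target
        chain.append(target)
    return chain


def hops_under(path, prefix="/home"):
    """Return the hops in the chain that live under the given prefix."""
    return [p for p in symlink_chain(path) if str(p).startswith(prefix + "/")]
```

Calling hops_under on an ancillary path and getting back a non-empty list would suggest that some link in the chain points into /home, which the compute nodes cannot see.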

Hi Dave, the symlink was me, but I hadn’t remembered that the compute nodes can’t see /home! A rookie mistake. I’ll change that now.
Thanks,
Ella

Hi CMS, returning to this ticket as I still have some storage questions.

I am expecting to produce a large amount of data in my PolarRES simulations, and I am wondering:

a) is there a script somewhere that can collect the umnsaa* outputs from subdirectories within the cylc-run directory and concatenate them like on Monsoon/MASS?

b) has anyone done any automation for archiving / transferring files from archer for storage and post-processing? Eventually the data need to be archived, but I need to post-process them (CMOR-ise and extract individual variables) and check them first, probably on Jasmin.

c) where do other modelling groups put their data during the production of long simulations like this? Is it best to keep it on archer or jasmin (for instance)? And if there’s a place people often store long simulations’ data, can I access this?

At the moment I’m testing different configurations with different initialisation / forecast times for a one-month period - one month of data on my Antarctic domain (640x702) is about 45 GB, but I’m looking at reducing the domain size. The Arctic domain is 540x544 so should be a similar(ish) size. I’ll be doing the period 2000-2021 for both of those domains, plus eventually running 1985-2100 for both too, so it will be a non-trivial amount of data!
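
As a rough back-of-envelope sketch of what that adds up to: the code below assumes the quoted 45 GB per month holds for both domains and for every month, that both periods are inclusive of their end years, and that 1 TB = 1000 GB. None of these assumptions has been checked against real output, and a smaller domain would reduce the totals.

```python
GB_PER_MONTH = 45  # figure quoted for the 640x702 Antarctic domain
N_DOMAINS = 2      # assumption: the Arctic domain costs about the same


def months_inclusive(first_year, last_year):
    """Number of months in a period that includes both end years."""
    return (last_year - first_year + 1) * 12


def total_tb(first_year, last_year,
             gb_per_month=GB_PER_MONTH, n_domains=N_DOMAINS):
    """Estimated output volume in TB (1 TB = 1000 GB) across all domains."""
    return months_inclusive(first_year, last_year) * gb_per_month * n_domains / 1000


if __name__ == "__main__":
    for first, last in [(2000, 2021), (1985, 2100)]:
        print(f"{first}-{last}: {total_tb(first, last):.1f} TB")
```

Under those assumptions the 2000-2021 runs come to roughly 24 TB and the 1985-2100 runs to roughly 125 TB across the two domains.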

Best wishes,
Ella

Hi Ella,

a) I’m not completely sure what you mean here. Do you mean the archive structure of the data in MASS? Or is it some script that is run separately on Monsoon before data is put into MASS? If the latter, could you point us to the script used on Monsoon, or to the task that does this in a suite?

b) Coincidentally, someone else asked about automating transfer of data from ARCHER2 to JASMIN in a nesting suite earlier this week (see the ticket “Pptransfer in UM nesting suite”). However, I’m just thinking now that rather than faffing with pptransfer you could try replacing the moo commands with rsync in the archive app to do the transfer. With rsync, the task would then have to run as a background task on a specific ARCHER2 login node.

c) Typically people transfer their data to JASMIN, as large amounts cannot be kept on ARCHER2. For long term storage, or data that is not going to be used immediately, data should be migrated to Elastic Tape at JASMIN rather than left on disk.
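
To make the rsync idea in (b) a bit more concrete, here is a minimal sketch of gathering the umnsaa* files from the cycle subdirectories and building a single rsync command for them. This is not the archive app or pptransfer: the function names are hypothetical, and the destination string would need to be replaced with a real JASMIN transfer host and group workspace path. It uses only standard rsync options (--archive, --partial, --relative).

```python
import subprocess
from pathlib import Path


def find_outputs(cycle_root, pattern="umnsaa*"):
    """Recursively list output files under a cylc-run directory."""
    return sorted(p for p in Path(cycle_root).rglob(pattern) if p.is_file())


def build_rsync_command(files, base, dest):
    """Build an rsync command that preserves paths relative to `base`."""
    # With --relative, the "/./" marker tells rsync which part of each
    # source path to recreate on the destination side.
    sources = [f"{base}/./{Path(f).relative_to(base)}" for f in files]
    return ["rsync", "--archive", "--partial", "--relative", *sources, dest]


def transfer(cycle_root, dest):
    """Run the transfer; `dest` is a placeholder such as 'user@host:/path/'."""
    files = find_outputs(cycle_root)
    if files:
        subprocess.run(build_rsync_command(files, cycle_root, dest), check=True)
```

Keeping the cycle subdirectory structure on the remote side means files from different cycles that share a name do not overwrite each other. Any concatenation or CMOR-ising would still be a separate step.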

Hope that helps,
Regards,
Ros.

Hi Ros, thanks for your response. On a): I seem to remember there is an archiving script that collates the data from subdirectories and prepares it before it gets archived - part of the ‘archive’ step I believe?

I’ll have a read-through of the pptransfer ticket (or I could just turn around and ask my colleague Ruth who is sitting behind me!) and see if I can find some space on jasmin somewhere.

Thanks as always for your help.
Best wishes
Ella