Pumatest account + set-up for running MetUM on ARCHER2

Hi Ella

Sorry for the slow reply - I was going to fix up u-by395 for the 23-cab (hopefully tomorrow) - hopefully I won’t have to.

This is the immediate problem:
site/ncas-cray-ex/suite-adds.rc: --account=n02

That needs to be --account='n02-PolarRES'

Grenville
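For anyone making the same change by hand, here is a minimal sketch of one way to do it from the top of the suite directory. `set_account` is a hypothetical helper written for this note, not part of Rose or Cylc, and it assumes the old and new account strings contain no characters that are special to sed (such as `/` or `&`).

```shell
# Hypothetical helper: rewrite a bare --account=OLD to --account=NEW in one file.
# It only matches OLD when it ends the token, so n02-cms is left alone when OLD is n02.
set_account() {
    file=$1; old=$2; new=$3
    tmp=$(mktemp) || return 1
    sed -E "s/--account=${old}([[:space:]]|\$)/--account=${new}\1/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example use (run from the suite directory):
#   grep -rn -- '--account' site/      # list every place an account is set
#   set_account site/ncas-cray-ex/suite-adds.rc n02 "'n02-PolarRES'"
```

Running the `grep` first is worthwhile because a suite may set the account in more than one file.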

Hi Grenville,
Great - I’ll change that for now and see what happens. Will you let me know when you’ve updated the suite?
BW,
Ella

Hi Ella

The archer2 branch of u-by395 is working now.

Grenville

Thanks Grenville, I will check out a new copy now.
E

Hi Ella

It’s not set up to use ANTS - if you need ANTS we’ll need to do more.

Grenville

Hi Grenville,

I’m not sure at the moment whether I’ll need ANTS. It will be good to make my own ancillaries eventually but I’m certainly not at that stage yet. For the time-being my priority is just to get a basic working configuration set up so that I can do some initial tests.

At the moment I’m still stuck on submit-retrying with the fcm-make2 and ancil steps (suite u-cn662). There’s not much helpful info in the activity logs; they just say the submit timed out.

Ella

Have you set your username and your ARCHER budget?

I have - no other references to ncas-cms or anyone else’s username as far as I can see…

No point letting it keep resubmitting - it thinks you are nobody

Suite    : u-cn662
Task Job : 20150323T0000Z/fcm_make2_um/07 (try 1)
User@Host: nobody@nid002477

Try stopping the suite and `rose suite-run --new`

I tried re-running as you suggest. Unfortunately I’m getting the same problem. Where did you find the reference to nobody@nid# ? I imagine I need to update something somewhere with my credentials?

In the log file: /home/n02/n02/shakka/cylc-run/u-cn662/log/job/20150323T0000Z/fcm_make2_um/02/job.out
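A quick way to pull the same information out of a job.out file is sketched below. `job_user` is a hypothetical helper, and it assumes the `User@Host:` header line has the form shown in the excerpt above.

```shell
# Hypothetical helper: print the user name from the "User@Host:" line of a job.out file.
job_user() {
    sed -n 's/^User@Host:[[:space:]]*\([^@]*\)@.*/\1/p' "$1" | head -n 1
}

# Example use:
#   job_user ~/cylc-run/u-cn662/log/job/20150323T0000Z/fcm_make2_um/02/job.out
```

If this prints anything other than your own ARCHER2 username, the job is not running as you.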

It only thinks you are nobody when trying to run on the compute node. Please try this:

cd /work/n02/n02/shakka/cylc-run/u-cn662/log/job/20150323T0000Z/fcm_make2_um/02
sbatch job
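When a job is submitted by hand like this, `sbatch` normally prints only a line of the form "Submitted batch job <id>". The sketch below shows one way to capture that ID and check on the job afterwards. `jobid_from_sbatch` is a hypothetical helper, and the assumption that the results end up in job.out and job.err in the same directory should be checked against the `#SBATCH --output` and `--error` lines in the job script itself.

```shell
# Hypothetical helper: read sbatch's stdout on stdin and print the numeric job ID.
jobid_from_sbatch() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*/\1/p'
}

# Example use (on ARCHER2, from the job's log directory):
#   jobid=$(sbatch job | jobid_from_sbatch)
#   squeue -j "$jobid"        # is it queued or running?
#   tail job.out job.err      # once the job has finished
```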

Ahh, so it can only be viewed once the job has finished, got it.

I’ve tried doing that (from archer) - what am I expecting to see? So far I have a confirmation that the job is submitted but I don’t see a command line traceback or any changes in the suite control window.

…and now we’re back to submit-retrying.

Something weird going on - we may need to consult ARCHER – that won’t happen 'til next week.

Okay, thanks for investigating. Will check back in with you next week. Have a great bank holiday weekend. Ella

Hi Grenville, hope you had a nice weekend. Any updates from archer regarding this strange behaviour? E

Hi Ella

Please copy /work/n02/n02/grenvill/eg.slurm into your work space (change the #SBATCH --account=n02-cms line to your own budget), then

sbatch eg.slurm

This is a simple case - what happens?

Grenville
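The contents of eg.slurm are not shown in this thread. For readers without access to it, the following is a hypothetical stand-in for a minimal test job of the same kind. The partition, QoS, time limit, and output file name are all assumptions and should be checked against the current ARCHER2 documentation; the account line needs changing to your own budget.

```shell
#!/bin/bash
# Hypothetical stand-in for eg.slurm, not the real file.
# Change the account below to your own budget code before submitting.
#SBATCH --job-name=eg
#SBATCH --account=n02-cms
#SBATCH --partition=standard
#SBATCH --qos=short
#SBATCH --nodes=1
#SBATCH --time=00:05:00
#SBATCH --output=eg.%j.out

# Report who and where the compute node thinks we are.
user=$(id -un 2>/dev/null || echo unknown)
host=$(hostname 2>/dev/null || echo unknown)
msg="hello from ${user} on ${host}"
echo "$msg"
```

A job this small separates a problem with the account or the file systems from a problem with the suite itself.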

Hi Grenville, I have done this and submitted (from my home directory on archer, i.e. /home/n02/n02/shakka). I get a warning about the working directory not being visible to compute nodes but it does submit the job. Is the “hello Ella” supposed to print to the console? I haven’t seen any output from it. E

You need to put eg.slurm in your /work space and sbatch it from there.
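The warning about the working directory comes from the fact that the ARCHER2 compute nodes cannot see /home. A small guard like the one sketched here can catch the mistake before submitting. `under_work` is a hypothetical helper, and it assumes work directories always sit under /work.

```shell
# Hypothetical helper: succeed only if the given path is on the /work file system.
under_work() {
    case "$1" in
        /work/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
#   if under_work "$(pwd -P)"; then sbatch eg.slurm; else echo "cd to /work first"; fi
```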

Apols, that was an obvious mistake. I’ve done that and the output in the .out file is as expected. What does this mean for the UM jobs?