Hi Ella
Sorry for the slow reply - I was going to fix up u-by395 for the 23-cab (hopefully tomorrow) - hopefully I won’t have to.
This is the immediate problem:
site/ncas-cray-ex/suite-adds.rc: --account=n02
That needs to be --account='n02-PolarRES'
Grenville
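A minimal sketch of making that change from the command line, assuming you are in the top level of your working copy of the suite and that the budget appears as a literal `--account=...` string in the file. The helper name `set_account` is made up for illustration; editing the file by hand works just as well.

```shell
# Hypothetical helper: point every --account=... setting in a file at a new budget.
# A backup of the original is kept alongside as <file>.bak.
set_account() {
    file=$1      # e.g. site/ncas-cray-ex/suite-adds.rc
    account=$2   # e.g. n02-PolarRES

    # Show what is currently set, with line numbers.
    grep -n -- '--account=' "$file" || echo "no --account line found in $file"

    # Replace the value after --account= (everything up to the next whitespace).
    sed -i.bak "s/--account=[^[:space:]]*/--account=${account}/" "$file"
}

# Usage (from the suite directory):
#   set_account site/ncas-cray-ex/suite-adds.rc n02-PolarRES
```

If the new value needs quoting inside the file, as in the message above, pass the quotes as part of the second argument.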
Hi Grenville,
Great - I’ll change that for now and see what happens. Will you let me know when you’ve updated the suite?
BW,
Ella
Hi Ella
The archer2 branch of u-by395 is working now.
Grenville
Thanks Grenville, I will check out a new copy now.
E
Hi Ella
It’s not set up to use ANTS - if you need ANTS we’ll need to do more.
Grenville
Hi Grenville,
I’m not sure at the moment whether I’ll need ANTS. It will be good to make my own ancillaries eventually but I’m certainly not at that stage yet. For the time being my priority is just to get a basic working configuration set up so that I can do some initial tests.
At the moment I’m still stuck on submit-retrying with the fcm-make2 and ancil steps (u-cn662). There’s not much helpful info in the activity logs; they just say the submit timed out.
Ella
Have you set your own username and your ARCHER budget in the suite?
I have - no other references to ncas-cms or anyone else’s username as far as I can see…
No point letting it keep resubmitting - it thinks you are nobody
Suite : u-cn662
Task Job : 20150323T0000Z/fcm_make2_um/07 (try 1)
User@Host: nobody@nid002477
Try stopping the suite and `rose suite-run --new`
I tried re-running as you suggest. Unfortunately I’m getting the same problem. Where did you find the reference to nobody@nid# ? I imagine I need to update something somewhere with my credentials?
In the log file: /home/n02/n02/shakka/cylc-run/u-cn662/log/job/20150323T0000Z/fcm_make2_um/02/job.out
It only thinks you are nobody when trying to run on the compute node. Please try this:
cd /work/n02/n02/shakka/cylc-run/u-cn662/log/job/20150323T0000Z/fcm_make2_um/02
sbatch job
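To see which user each attempt ran as without opening the files one by one, the `User@Host` line can be pulled out of every `job.out` under the suite's job log directory. This is only a sketch: the function name `show_job_users` is made up, and it assumes the log layout shown in the paths above.

```shell
# Hypothetical helper: print the User@Host line from every job.out below a directory.
show_job_users() {
    log_root=$1   # e.g. ~/cylc-run/u-cn662/log/job

    # -H prefixes each match with the file it came from, so attempts can be told apart.
    find "$log_root" -name job.out -exec grep -H 'User@Host' {} +
}

# Usage:
#   show_job_users ~/cylc-run/u-cn662/log/job
```

Any attempt that reports `nobody@...` rather than your own username is showing the same problem as the excerpt above.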
Ahh, so it can only be viewed once the job has finished, got it.
I’ve tried doing that (from archer) - what am I expecting to see? So far I have a confirmation that the job is submitted but I don’t see a command line traceback or any changes in the suite control window.
…and now we’re back to submit-retrying.
Something weird is going on - we may need to consult ARCHER, and that won’t happen 'til next week.
Okay, thanks for investigating. Will check back in with you next week. Have a great bank holiday weekend. Ella
Hi Grenville, hope you had a nice weekend. Any updates from archer regarding this strange behaviour? E
Hi Ella
Please take a copy of /work/n02/n02/grenvill/eg.slurm into your work space (change the #SBATCH --account=n02-cms line to your own budget), then
sbatch eg.slurm
This is a simple case - what happens?
Grenville
Hi Grenville, I have done this and submitted (from my home directory on archer, i.e. /home/n02/n02/shakka). I get a warning about the working directory not being visible to compute nodes but it does submit the job. Is the “hello Ella” supposed to print to the console? I haven’t seen any output from it. E
you need to put eg.slurm in your /work & sbatch it from there
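For anyone following along without access to that file, a test job of this kind might look roughly like the sketch below. The real contents of eg.slurm are not shown in this thread, so everything here is an assumption: the helper name `write_hello_slurm`, the job name, and the partition, QoS and time limit, which should be checked against the current ARCHER2 documentation.

```shell
# Hypothetical helper: write a minimal Slurm test script to a given path.
write_hello_slurm() {
    target=$1    # where to write the script; must be under /work to be visible to compute nodes
    account=$2   # your own budget code

    cat > "$target" <<EOF
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --account=${account}
#SBATCH --partition=standard
#SBATCH --qos=short
#SBATCH --nodes=1
#SBATCH --time=00:05:00

# Report who the job runs as on the compute node.
echo "hello from \$(whoami) on \$(hostname)"
EOF
}

# Usage (from a directory under /work):
#   write_hello_slurm hello.slurm <your-budget>
#   sbatch hello.slurm
```

By default Slurm writes the job's output to a slurm-<jobid>.out file in the directory the job was submitted from, so that is where to look for the greeting.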
Apols, that was an obvious mistake. I’ve done that and get output in a .out file that behaves as expected. What does this mean for the UM jobs?