Running on login-4c.archer2.ac.uk

Hi guys,

I managed to re-add id_rsa_archerum to ssh-agent, so that I now have

marc@puma:/home/marc/roses/u-ck231> ssh-add -l
4096 SHA256:JtBQhDQKIHjiJfutS98+WBmUWfOuoDoPl5BKIXu6NfM /home/marc/.ssh/id_rsa_archerum (RSA)

but I’m still failing to submit jobs to login-4c.archer2.ac.uk with job.err output like

[WARN] symlink ignored: svn://puma/nemo.xm_svn/trunk/NEMOGCM/CONFIG/SHARED/1_namelist_ref@5518
[FAIL] login-4c.archer2.ac.uk:cylc-run/u-ck231/share/fcm_make_ocean: cannot create mirror target
[FAIL] ssh -n -oBatchMode=yes login-4c.archer2.ac.uk pwd # rc=255
[FAIL] Permission denied (publickey).

[FAIL] fcm make -f /home/marc/cylc-run/u-ck231/work/18500101T0000Z/fcm_make_ocean/fcm-make.cfg -C /home/marc/cylc-run/u-ck231/share/fcm_make_ocean -j 4 mirror.target=login-4c.archer2.ac.uk:cylc-run/u-ck231/share/fcm_make_ocean mirror.prop{config-file.name}=2 # return-code=2
Received signal ERR
cylc (scheduler - 2021-12-13T14:12:18Z): CRITICAL Task job script received signal ERR at 2021-12-13T14:12:18Z
cylc (scheduler - 2021-12-13T14:12:18Z): CRITICAL failed at 2021-12-13T14:12:18Z

I am able to run `ssh -n -oBatchMode=yes login-4c.archer2.ac.uk pwd` interactively, so I’m not sure what the problem is. Maybe something is wrong with the mirror, although that seems to be fine.

Any ideas?
Thanks, Marc
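
Since the command works interactively but fails from the job, one thing worth comparing is whether both environments can reach an ssh-agent that holds the key. The sketch below is only an illustration: the helper name describe_agent_status is made up for this note, and it relies on the exit statuses documented for ssh-add (0 on success, 1 if the command fails, 2 if no agent can be contacted).

```shell
# Hypothetical helper: translate the exit status of `ssh-add -l`
# into a short diagnosis. The function name is an assumption.
describe_agent_status() {
    case "$1" in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, but no keys loaded" ;;
        2) echo "cannot contact an agent (SSH_AUTH_SOCK unset or stale)" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Usage, e.g. in the interactive shell and again in the job environment:
#   ssh-add -l >/dev/null 2>&1; describe_agent_status $?
```

If the two environments report different results, the job is probably not talking to the agent that has id_rsa_archerum loaded.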

Hi Marc,

What do you get when you type ssh login-4c.archer2.ac.uk on PUMA?

Cheers,
Ros.

Hi Ros,

Thanks for the very quick response. I get

marc@puma:/home/marc/roses/u-ck231> ssh login-4c.archer2.ac.uk
PTY allocation request failed on channel 0
Comand rejected by policy. Not in authorised list
Connection to login-4c.archer2.ac.uk closed.

Thanks, Marc

Hmmmm, that’s the expected response.

Can you give me permission to read your /work and /home directories on ARCHER2 please?

chmod -R g+rX /work/n02/n02/marc
chmod -R g+rX /home/n02/n02/marc

Ta.
Cheers,
Ros.

Thanks Ros, that’s done.

Hmmmm, I can’t see anything obvious. Have you tried it several times, and also run rose suite-run --new to clear everything out, just in case?

Cheers,
Ros.

Hi Ros,

Yes, I’ve tried both those things.

I’m finishing for today, so I’ll see if it works tomorrow.

Thanks for looking into it, Marc

Hi Marc,

So it works OK for me. My only other suggestion, if it still fails in the morning, is to run the failed fcm command directly on the PUMA command line and see what happens. It also looks like you have 2 ssh-agents running on PUMA, and I wonder if that is somehow causing issues…

ros@puma$ ps -flu marc
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
1 S marc     19866     1  0  80   0 -  5861 ?      13:19 ?        00:00:01 ssh-agent -s
1 S marc     28053     1  0  80   0 -  5861 ?      Nov01 ?        00:10:22 ssh-agent

Have a good evening.
Cheers,
Ros.
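
For anyone hitting the same thing, a rough way to check for duplicate agents yourself is sketched below. The function names count_agents and show_agent_socket are invented for this note; the first just counts ssh-agent lines in ps output like the listing above, and the second prints the socket the current shell is using.

```shell
# Hypothetical helper: count ssh-agent processes in `ps -flu <user>`-style
# text read from stdin. The header line is ignored because it never
# contains the word ssh-agent.
count_agents() {
    awk '$0 ~ "(^|[ /])ssh-agent( |$)" { n++ } END { print n + 0 }'
}

# Hypothetical helper: report which agent socket this shell points at.
show_agent_socket() {
    if [ -n "${SSH_AUTH_SOCK:-}" ]; then
        echo "SSH_AUTH_SOCK=$SSH_AUTH_SOCK"
    else
        echo "SSH_AUTH_SOCK is not set"
    fi
}

# Usage on PUMA:
#   ps -flu "$USER" | count_agents
#   show_agent_socket
```

A count above 1 is not necessarily a fault in itself, but it does mean the key may be loaded in one agent while some processes are pointed at the other.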

Hi Ros,

Can I just check, do you mean running something like the following?

marc@puma:/home/marc/cylc-run/u-ck230/work/18500101T0000Z/fcm_make_drivers> fcm make -f /home/marc/cylc-run/u-ck230/work/18500101T0000Z/fcm_make_drivers/fcm-make.cfg -C /home/marc/cylc-run/u-ck230/share/fcm_make_drivers -j 4 mirror.target=login-4c.archer2.ac.uk:cylc-run/u-ck230/share/fcm_make_drivers mirror.prop{config-file.name}=2
[init] make # 2021-12-14T10:07:24Z
[info] FCM 2017.10.0 (/home/fcm/fcm-2017.10.0)
[init] make config-parse # 2021-12-14T10:07:24Z
[info] config-file=/home/marc/cylc-run/u-ck230/work/18500101T0000Z/fcm_make_drivers/fcm-make.cfg
[FAIL] make config-parse # 0.3s
[FAIL] make # 0.3s
[FAIL] config-file=/home/marc/cylc-run/u-ck230/work/18500101T0000Z/fcm_make_drivers/fcm-make.cfg:4
[FAIL] config-file= - /Coupled_Drivers/fcm_make/driver.cfg
[FAIL] /Coupled_Drivers/fcm_make/driver.cfg: cannot load config file
[FAIL] /Coupled_Drivers/fcm_make/driver.cfg: cannot be read
[FAIL] No such file or directory

Thanks, Marc

Hi Marc,

Yes, that was what I meant. I’m not sure that helped, but I’ve just noticed you don’t have an entry for login-4c in your ~/.ssh/config file. Please try adding the following:

Host login-4c.archer2.ac.uk
    User marc
    IdentityFile ~/.ssh/id_rsa_archerum
    ForwardX11 no
    ForwardX11Trusted no

Cheers,
Ros.
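
A possible way to confirm that the Host entry above is actually being picked up is to ask ssh for its resolved configuration with ssh -G, which is available in OpenSSH 6.8 and later (whether PUMA's ssh is new enough is an assumption here). The helper name resolved_identity is made up for this note; it only filters the user and identityfile lines out of that output.

```shell
# Hypothetical helper: print the user and identity file(s) found in
# `ssh -G <host>` output read from stdin. ssh -G prints one lowercase
# keyword and its value per line.
resolved_identity() {
    awk '$1 == "user" || $1 == "identityfile" { print $1 "=" $2 }'
}

# Usage on PUMA:
#   ssh -G login-4c.archer2.ac.uk | resolved_identity
```

If id_rsa_archerum does not appear in the result, the entry is probably not being matched, for example because of a typo in the Host line.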

Thanks Ros,

I’ve done this, logged out, restarted ssh-agent and re-added ~/.ssh/id_rsa_archerum, but it’s still failing in the same way. I’ve run out of ideas for the moment.

Thanks, Marc

Continued and solved in ssh-agent setup