Switching from Archer-4c to Archer2

Hi,

Following the email of 16th December announcing that the UM could be run on Archer2, I’ve tried to switch from running on the 4-cabinet system to running on Archer2. In my suite u-ck528, in suite/archer2.rc, I changed host = login-4c.archer2.ac.uk to host = login.archer2.ac.uk and recreated my archerum key using login.archer2.ac.uk again (reversing the instruction in the red box at the top of the webpage http://cms.ncas.ac.uk/wiki/Archer2/SshAgentSetup). I believe the archerum key has been installed because running ssh login.archer2.ac.uk returns:

PTY allocation request failed on channel 0
Command rejected by policy. Not in authorised list
Connection to login.archer2.ac.uk closed.

However, when I tried to run u-ck528, the attempt failed with:

[FAIL] ssh -oBatchMode=yes login.archer2.ac.uk bash --login -c 'ROSE_VERSION=2016.11.1\ rose\ suite-run\ -v\ -v\ --name=u-ck528\ --run=run\ --remote=uuid=3be625bc-b4b5-4e8e-9fc5-f07f460f5a6f,root-dir=$DATADIR' # return-code=255, stderr=

[FAIL] Host key verification failed.

This looks like an error I had previously, which was solved by switching the host to login-4c.archer2.ac.uk, but I assume that isn’t correct if I want to run on Archer2 rather than Archer-4c.

Are there any other changes required to allow suites which previously ran on Archer-4c to run on Archer2?

Many thanks for your help,

James

Hi James,

It looks like ssh still wants input from you to sort out host keys in the known_hosts file. The job submission has probably landed on a different login node from the one you reached when you ran ssh on the PUMA command line.

On the PUMA command line, log in to each of the login nodes (1-4):

E.g. ssh login1.archer2.ac.uk

And follow any instructions that pop up.

Cheers,
Ros.

Thanks, Ros. I have run those commands. login1, 3 and 4 returned the following (substitute 3 or 4 for 1):

The authenticity of host ‘login1.archer2.ac.uk (193.62.216.42)’ can’t be established.
ECDSA key fingerprint is SHA256:UGS+LA8I46LqnD58WiWNlaUFY3uD1WFr+V8RCG09fUg.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added ‘login1.archer2.ac.uk’ (ECDSA) to the list of known hosts.
Warning: the ECDSA host key for ‘login1.archer2.ac.uk’ differs from the key for the IP address ‘193.62.216.42’
Offending key for IP in /home/jmw240/.ssh/known_hosts:4
Are you sure you want to continue connecting (yes/no)? yes
Permission denied (publickey).

I can see these are in the ~/.ssh/known_hosts file.

When I ran ssh login2.archer2.ac.uk, nothing happened and it eventually timed out.

u-ck528 returned the same error message when I tried to run it as before.

Cheers,

James

Hi James,

Sorry that should have said:

E.g. ssh -i ~/.ssh/id_rsa_archerum jweber@login1.archer2.ac.uk

as you don’t have your config set up for all the individual nodes. When it reports offending keys, you need to remove those from the known_hosts file.

Once you have done that on all login nodes (login2 is probably down) you should get the usual “…Command rejected by policy. Not in authorised list…” messages on login. There may be multiple keys to be removed.
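
For the offending-key step, here is a minimal sketch, assuming the warning reports a location such as known_hosts:4, as in the output earlier in this thread. The helper name remove_known_hosts_line is hypothetical (it is not a PUMA or ARCHER2 tool); it deletes one numbered line from the file and keeps a .bak copy alongside it.

```shell
# Hypothetical helper: delete line number LINE from FILE, keeping FILE.bak.
# Usage: remove_known_hosts_line FILE LINE
remove_known_hosts_line() {
    file="$1"
    line="$2"
    # Refuse anything that is not a plain number, so a typo cannot
    # turn into a different sed command.
    case "$line" in
        ''|*[!0-9]*) echo "line must be a number" >&2; return 1 ;;
    esac
    # -i.bak edits in place and leaves the original as FILE.bak
    sed -i.bak "${line}d" "$file"
}

# Example for "Offending key for IP in /home/jmw240/.ssh/known_hosts:4":
#   remove_known_hosts_line ~/.ssh/known_hosts 4
```

Line numbers shift after each deletion, so rerun the ssh command and use the newly reported number before removing the next key. Running ssh-keygen -R 193.62.216.42 should have the same effect by address rather than by line number.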

Regards,
Ros

Thanks, Ros. I have run the command and removed the offending keys from known_hosts, so that on rerunning the command I get the “Command rejected by policy. Not in authorised list…” message. When I then run u-ck528, it starts but fails quickly on fcm_make_um with the error below:

[FAIL] config-file=/home/jmw240/cylc-run/u-ck528/work/19790101T0000Z/fcm_make_um/fcm-make.cfg:3
[FAIL] config-file= - https://code.metoffice.gov.uk/svn/um/main/branches/dev/simonwilson/vn11.1_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg
[FAIL] https://code.metoffice.gov.uk/svn/um/main/branches/dev/simonwilson/vn11.1_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg: cannot load config file
[FAIL] https://code.metoffice.gov.uk/svn/um/main/branches/dev/simonwilson/vn11.1_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg: not found
[FAIL] svn: E215004: Authentication failed and interactive prompting is disabled; see the --force-interactive option
[FAIL] svn: E215004: Unable to connect to a repository at URL ‘https://code.metoffice.gov.uk/svn/um/main/branches/dev/simonwilson/vn11.1_archer2_compile/fcm-make/ncas-ex-cce/um-atmos-safe.cfg
[FAIL] svn: E215004: No more credentials or we tried too many times.
[FAIL] Authentication failed

[FAIL] fcm make -f /home/jmw240/cylc-run/u-ck528/work/19790101T0000Z/fcm_make_um/fcm-make.cfg -C /home/jmw240/cylc-run/u-ck528/share/fcm_make_um -j 4 mirror.target=login.archer2.ac.uk:cylc-run/u-ck528/share/fcm_make_um mirror.prop{config-file.name}=2 # return-code=1

This looks to involve the branches mentioned in the quick start section of http://cms.ncas.ac.uk/wiki/Archer2#ARCHER2-FullSystem. Do I need to change these too?

Cheers,

James

Hi James,

Cache your MOSRS credentials on PUMA and it should then work.

Regards,
Ros.

Hi Ros,

Thanks, that makes sense. I tried mosrs-cache-password to cache my MOSRS password but it returned:

gpg-agent: no process killed
Error: gpg-agent not working
Run “mosrs-cache-password” to try caching your password again

Could this be a forwarding problem?

James

Hi James,

You should just need to log out of PUMA and back in again. This should start up the gpg-agent and then prompt you for your mosrs password on login to PUMA.

However, it looks like you’ve been editing your .profile today? I can’t see the MOSRS setup lines that should be in there:

# MOSRS Setup
[[ "$-" != *i* ]] && return # Stop here if not running interactively
. mosrs-setup-gpg-agent
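
As a rough illustration of what the guard line above does: bash puts the letter i into the special parameter $- only for interactive shells, so the return skips the MOSRS setup for non-interactive sessions. The sketch below uses a hypothetical function name, is_interactive, purely to show the test in isolation.

```shell
# Hypothetical helper: succeed if the given flag string (normally "$-")
# contains "i", which bash sets for interactive shells.
is_interactive() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Report which branch the current shell would take.
if is_interactive "$-"; then
    echo "interactive: MOSRS setup would run"
else
    echo "non-interactive: stop before MOSRS setup"
fi
```

Run from a script or a batch ssh command, this takes the non-interactive branch, which is the case in which the password prompt is skipped.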

Regards,
Ros.

Hi Ros,

Thanks, that has done it. I’m now prompted for my MOSRS password when logging in.

Cheers,

James