PPTransfer failure with UM to GWS JASMIN (ARCHER2)

Hello,

I’m trying to run the UM on ARCHER2 for the first time. I’ve copied a suite from Luke Abraham (u-dd635), which works and archives correctly to the group workspace on JASMIN. My copied suite is u-de764, and I’m trying to archive to the same group workspace on JASMIN. The model setup is a nudged run, and the run itself works fine. After the first (and second) month, however, the workflow gets stuck at the PPTransfer stage.

I’ve been through the PPTransfer configuration page and followed the instructions (https://cms.ncas.ac.uk/unified-model/pptransfer/), and my credentials are in date, but the model is still failing and I’m getting the following error:

[WARN] file:atmospp.nl: skip missing optional source: namelist:moose_arch
[WARN] file:atmospp.nl: skip missing optional source: namelist:script_arch
[WARN]  [SUBPROCESS]: Command: globus-url-copy -vb -cd -r -cc 4 -sync -sync-level 3 -cred /work/n02/n02/m_brown2/cred.jasmin /work/n02/n02/m_brown2/archive/u-de764/19820201T0000Z/ gsiftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/hecter/m_brown/ARCHER_archive/u-de764/19820201T0000Z/
[SUBPROCESS]: Error = 1:
	Error loading source credential: GSS failure: 
GSS Major Status: General failure
GSS Minor Status Error Chain:
globus_sysconfig: File has bad permissions: Permissions on /work/n02/n02/m_brown2/cred.jasmin are too permissive. Maximum allowable permissions are 600



[WARN]  Transfer command failed: globus-url-copy -vb -cd -r -cc 4 -sync -sync-level 3 -cred /work/n02/n02/m_brown2/cred.jasmin /work/n02/n02/m_brown2/archive/u-de764/19820201T0000Z/ gsiftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/hecter/m_brown/ARCHER_archive/u-de764/19820201T0000Z/
[ERROR]  transfer.py: Unknown Error - Return Code=1
[FAIL]  Command Terminated
[FAIL] Terminating PostProc...
[FAIL] transfer.py <<'__STDIN__'
[FAIL] 
[FAIL] '__STDIN__' # return-code=1
2024-03-26T12:24:54Z CRITICAL - failed/EXIT

I’ve checked and my permissions on cred.jasmin are set to 600 and it’s located under the work directory. I have permission for hpxfer and I can ssh onto the JASMIN servers fine from ARCHER2.

Please could someone help me to get the model to archive properly?

Many thanks,
Megan
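
For reference, the permission requirement quoted in the error above (no more than 600 on the credential file) can be checked from the shell. This is a minimal sketch and not part of the PPTransfer tooling: the function name check_cred_perms is made up for illustration, and it assumes GNU stat as found on Linux systems.

```shell
# Report the octal mode of a credential file and reset it to 600 if it differs.
# check_cred_perms is a hypothetical helper name, not an existing tool.
check_cred_perms() {
    cred="$1"
    if [ ! -f "$cred" ]; then
        echo "missing: $cred" >&2
        return 1
    fi
    mode=$(stat -c '%a' "$cred")   # GNU stat: print permissions in octal
    if [ "$mode" = "600" ]; then
        echo "mode $mode is acceptable"
    else
        echo "mode $mode is not 600, resetting"
        chmod 600 "$cred"
    fi
}

# Example (path taken from the log above):
#   check_cred_perms /work/n02/n02/m_brown2/cred.jasmin
```

Running it twice is harmless: the second call should simply report that the mode is acceptable.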

Megan

On ARCHER2, please run the following and post the output here:

openssl x509 -in cred.jasmin -noout -startdate -enddate

Grenville

Hi Grenville,

Here’s the output:

notBefore=Mar 21 09:11:21 2024 GMT
notAfter=Apr 20 09:11:21 2024 GMT

Megan
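
The dates above show the credential is in date. As a side note, the same check can be scripted with the -checkend option of openssl x509, which sets the exit status according to whether the certificate expires within a given number of seconds. A minimal sketch follows; the helper name cred_valid_for is an assumption made for illustration.

```shell
# Succeed (exit status 0) if the certificate in file $1 will still be valid
# $2 seconds from now; fail otherwise. cred_valid_for is a hypothetical name.
cred_valid_for() {
    openssl x509 -in "$1" -noout -checkend "$2" >/dev/null
}

# Example: warn if the credential expires within the next 24 hours.
#   cred_valid_for cred.jasmin 86400 || echo "credential expires within a day"
```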

Please try the following on ARCHER2 and post the result:

module load gct
globus-url-copy -vb -cd -r -cc 4 -sync -sync-level 3 -cred /work/n02/n02/m_brown2/cred.jasmin /work/n02/n02/m_brown2/archive/u-de764/19820101T0000Z/ gsiftp://gridftp1.jasmin.ac.uk/gws/nopw/j04/hecter/m_brown/ARCHER_archive/u-de764/19820101T0000Z/

Then please try moving your .globus directory away and rerun the globus-url-copy command above.

Grenville
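
One way to move a directory such as .globus aside without losing it is to rename it with a timestamp suffix, so it can be restored later if needed. The sketch below is only an illustration; the function name move_aside and the .bak suffix are assumptions, not something the instructions above prescribe.

```shell
# Rename a directory to a timestamped backup name and print the new name.
# move_aside is a hypothetical helper name.
move_aside() {
    dir="${1%/}"                      # drop a trailing slash, if any
    if [ ! -d "$dir" ]; then
        echo "nothing to move: $dir" >&2
        return 1
    fi
    backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$backup" && echo "$backup"
}

# Example:
#   move_aside "$HOME/.globus"
```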

After the first attempt at the command, I get:

error: globus_ftp_control: gss_init_sec_context failed
GSS failure: 
GSS Major Status: General failure
GSS Minor Status Error Chain:
globus_gsi_gssapi: Error with gss context
globus_gsi_gssapi: Error with gss credential handle
globus_credential: Valid credentials could not be found in any of the possible locations specified by the credential search order.
Valid credentials could not be found in any of the possible locations specified by the credential search order.
Attempt 1
globus_credential: Error reading host credential
globus_sysconfig: Could not find a valid certificate file: The host cert could not be found in: 
1) env. var. X509_USER_CERT
2) /etc/grid-security/hostcert.pem
3) $GLOBUS_LOCATION/etc/hostcert.pem
4) $HOME/.globus/hostcert.pem

The host key could not be found in:
1) env. var. X509_USER_KEY
2) /etc/grid-security/hostkey.pem
3) $GLOBUS_LOCATION/etc/hostkey.pem
4) $HOME/.globus/hostkey.pem


Attempt 2
globus_credential: Error reading proxy credential
globus_sysconfig: Could not find a valid proxy certificate file location
globus_sysconfig: Error with key filename
globus_sysconfig: File does not exist: /tmp/x509up_u45867 is not a valid file
Attempt 3
globus_credential: Error reading user credential
globus_sysconfig: Error with certificate filename: The user cert could not be found in: 
1) env. var. X509_USER_CERT
2) $HOME/.globus/usercert.pem
3) $HOME/.globus/usercred.p12

And after I’ve moved my .globus directory (located in /home/n02/n02/m_brown2), I get the following:

Error loading source credential: GSS failure: 
GSS Major Status: General failure
GSS Minor Status Error Chain:
globus_gsi_gssapi: Unable to read credential for import
globus_gsi_gssapi: Error with gss credential handle
globus_gsi_gssapi: Error with GSI credential
globus_sysconfig: Could not find a valid trusted CA certificates directory: The trusted certificates directory could not be found in any of the following locations: 
1) env. var. X509_CERT_DIR
2) $HOME/.globus/certificates
3) /etc/grid-security/certificates
4) $GLOBUS_LOCATION/share/certificates

Megan
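
The second error lists four places where a trusted CA certificates directory is looked for. To see which of them, if any, exists for the current user, a sketch along these lines may help. It only walks the locations named in the error message in the same order; the function name find_ca_dir is made up for illustration, and this is a diagnostic aid rather than a fix.

```shell
# Print the first existing trusted-CA directory among the locations listed in
# the error message, checked in the same order. find_ca_dir is a hypothetical name.
find_ca_dir() {
    for d in "${X509_CERT_DIR:-}" \
             "$HOME/.globus/certificates" \
             /etc/grid-security/certificates \
             "${GLOBUS_LOCATION:+$GLOBUS_LOCATION/share/certificates}"; do
        # Skip entries that are empty because the variable is unset.
        if [ -n "$d" ] && [ -d "$d" ]; then
            echo "$d"
            return 0
        fi
    done
    echo "no trusted CA certificates directory found" >&2
    return 1
}

# Example:
#   find_ca_dir
```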

Megan

OK - two more things to try

    • regenerate the credential & try the globus-url-copy again

then we may need to engage JASMIN.
Grenville

I can ssh into hpxfer1 from puma2, but not from ARCHER2: the .ssh/config isn’t set up in my home directory on ARCHER2.

Before I regenerate the credential file, shall I move my .globus directory back to where it was originally or will this be automatically created when I redo the credentials?

No - please let the process generate a new one.

I’ve recreated the credential file as well now. Output from openssl x509 -in cred.jasmin -noout -startdate -enddate:

notBefore=Mar 27 09:43:41 2024 GMT
notAfter=Apr 26 09:43:41 2024 GMT

Megan

Sorry, forgot to check the globus-url-copy command. The output is:


error: globus_ftp_control: gss_init_sec_context failed
GSS failure: 
GSS Major Status: General failure
GSS Minor Status Error Chain:
globus_gsi_gssapi: Error with gss context
globus_gsi_gssapi: Error with gss credential handle
globus_credential: Valid credentials could not be found in any of the possible locations specified by the credential search order.
Valid credentials could not be found in any of the possible locations specified by the credential search order.
Attempt 1
globus_credential: Error reading host credential
globus_sysconfig: Could not find a valid certificate file: The host cert could not be found in: 
1) env. var. X509_USER_CERT
2) /etc/grid-security/hostcert.pem
3) $GLOBUS_LOCATION/etc/hostcert.pem
4) $HOME/.globus/hostcert.pem

The host key could not be found in:
1) env. var. X509_USER_KEY
2) /etc/grid-security/hostkey.pem
3) $GLOBUS_LOCATION/etc/hostkey.pem
4) $HOME/.globus/hostkey.pem


Attempt 2
globus_credential: Error reading proxy credential
globus_sysconfig: Could not find a valid proxy certificate file location
globus_sysconfig: Error with key filename
globus_sysconfig: File does not exist: /tmp/x509up_u45867 is not a valid file
Attempt 3
globus_credential: Error reading user credential
globus_sysconfig: Error with certificate filename: The user cert could not be found in: 
1) env. var. X509_USER_CERT
2) $HOME/.globus/usercert.pem
3) $HOME/.globus/usercred.p12
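
The error above names several files that are searched for a proxy or user credential. To see which of them are actually present, something like the following sketch could be used. It inspects only the paths named in the error message (the host certificate locations are left out); the function name list_user_creds is an assumption for illustration, and the sketch reports what exists without attempting to repair anything.

```shell
# List which of the proxy/user credential files named in the error exist.
# Returns 0 if at least one was found, 1 otherwise.
# list_user_creds is a hypothetical helper name.
list_user_creds() {
    found=1
    for f in "${X509_USER_CERT:-}" \
             "/tmp/x509up_u$(id -u)" \
             "$HOME/.globus/usercert.pem" \
             "$HOME/.globus/usercred.p12"; do
        # Skip the entry if X509_USER_CERT is unset or empty.
        if [ -n "$f" ] && [ -f "$f" ]; then
            echo "$f"
            found=0
        fi
    done
    return "$found"
}

# Example:
#   list_user_creds || echo "none of the listed credential files exist"
```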